Vermeille's blog

McAfee, why do you still exist?

Okay, this post is less formal or interesting than usually, I just want to express the hate and rage I have against McAfee.

Dear McAfee, an anti-virus follows one simple concept: while you're installed, keep the computer safe. If you want the user to pay for it, fine. But an anti-virus is not about BLOATING THE USER'S COMPUTER IF HE DOESN'T PAY . So far, with friends and family computers, I've noticed several way you use to fuck up people's computer:

1. Hook when Windows loads a driver, and insert a huge fucking delay in it. Plug something, and the computer freezes for few seconds. GREAT.
2. Make the computer slow as fuck. Just launch 4 threads doing an infinite loop. Amazing, uh?
3. Fuck Windows up so that users cannot access http websites, only https.

Do you have McAfee on your computer? Just fuckin' remove it. Never ever pay for those crooks.

Vim-Eqshow

Vim supporting python scripting makes writing plugin a far more enjoyable experience. In this post, I'm going to discuss briefly about the plugin vim-eqshow I wrote few days ago. I will not discuss the vim API and implementation support, but the algorithms used in it. There's nothing groundbreaking or supersmart but I find enjoyable anyway.

Motivation

I'm currently learning about machine learning, hence the lot of maths I'm writing now (I promise, I'll post soon (I hope) about the deep NLP stuff I'm working on). And it appears that writing maths is absolutely awful, and readability is even worse. Just look at this, logistic regression with L2 regularization in matlab:

J = 1 / m * sum(-y .* log(h) - (1 - y) .* log(1 - h)) + lambda / m * sum(theta2 .^ 2);

Awful, isn't it? My take on this would be to be able to switch vim in a "view mode" where equations are pretty printed to ease reading, then going back in edit mode to continue writing code. And the equational view mode would show the previous equation like this:

        ====                                           ====
1   \                                          λ   \      2
J = - *  >  (-y ⊗ log(h) - (1 - y) ⊗ log(1 - h)) + - *  >  (θ2 )
m   /                                          m   /
====                                           ====

An abstract datastructure

Nothing new under the sun: we need a way to represent our code as datastructures, and the usual AST is here to help. Few Python classes are necessary for a simple toy case: Binop (binary operation), Term (variable or number), Frac (a fraction, handle separately than other Binops because its rendering is really different), Superscript, Parenthesized, and whatever you can come up with.

Simple example:

# Start to write our library

def Term(object):
txt = None
def __init__(txt):
self.txt = txt

def Binop(object):
lhs = None
op = None
rhs = None
def __init__(lhs, op, rhs):
self.lhs = lhs
self.op = op
self.rhs = rhs

def Frac(object):
up = None
down = None
def __init__(up, down):
self.up = up
self.down = down

# [...] Really simple stuff

# And a simple example:
# We represent 1 + 2 / 3 + 4 as
expr = Binop(Binop(Term('1'), '+', Frac(Term('2'), Term('3'))), '+', Term('4'))

The canvas

We're dealing with ASCII art (utf-8, actually, but "Utf-8 art" is not even a thing), so everything is about writing the right characters on the right column and row, one line at a time, without possibility to go back. In Python, we will just fill a list of list of chars, each sub list being a line.

Sizing the canvas

How big should it be? It obviously depends on the equation. To find out, recursively answer the question "how much room do I need to show up?"

• for a Term, there is just one line of length len(self.txt) needed. Example of the bounding box:
    txt
┌──┐
│42│
└──┘
• for a binop, it's a little more complicated. For the width, the operator itself needs one line and typically 3 characters including whitespaces (' + '), plus the width of the lhs, plus the width of the rhs. For the height, the max of the height of the lhs and rhs. Example:
        lhs       op rhs
┌───────────┐
│     1     ├───┬──┐
│-----------│_+_│42│
│1 + exp(-x)├───┴──┘
└───────────┘
• For a fraction, the width needed is the max of width of the numerator and the denominator, and the height is the sum of both plus one for the line.
       ┌─┐
│1│    lhs
┌──┴─┴──┐
│-------│ op
├───────┤
│     -x│ rhs
│1 + e  │
└───────┘

Simply add them recursively, and you have the size of the bounding box. It's that simple. Well, it could be, if the only thing we had to do was to compute the size. But looking a little forward, we can easily guess that we could need a separate count of what's up, and what's down, so that we can vertically center our text. We can also cache the results that will be needed when drawing. The code is thus:


class Term(object):
# As before
def size(self):
self.length = len(self.txt)
return (self.length, 0, 0)

class Binop(object):
# As before
def size(self):
(ax, a_up, a_down) = self.a.size()
(bx, b_up, b_down) = self.b.size()
self.ax = ax
self.a_up = a_up
self.b_up = b_up
return (ax + len(self.op) + bx, max(a_up, b_up), max(a_down, b_down))

class Frac(object):
#As before
def size(self):
(ax, a_up, a_down) = self.top.size()
(bx, b_up, b_down) = self.bottom.size()
self.ax = ax
self.a_up = a_up
self.a_down = a_down
self.bx = bx
return (max(ax, bx), a_up + a_down + 1, b_up + b_down + 1)

Nothing scary, AST traversal.

Drawing

The drawing will happen as follow: the root note is given the coordinate (0, 0) as its top-left corner, will compute where to write its own stuff, and recursively give a subset of the canvas to its children, by changing the coordinate. See the algorithm performing on (1 / 2) + 3. The size needed is 3x5, the box show the bounding box in wich the algorithm is currently drawing

    Binop(..., ' + ', ...)

┌─────┐
│     │
│  +  │
│     │
└─────┘

Binop(Frac(..., ...), ..., ...)
┌─┐
│ │
│-│ +
│ │
└─┘

Binop(Frac(Term('1'), ...), ..., ...)
┌─┐
│1│
└─┘
- +

Binop(Frac(..., Term('2')), ..., ...)

1
- +
┌─┐
│2│
└─┘

Binop(..., ..., Term('3'))

1   ┌─┐
- + │3│
2   └─┘

I don't show the code for this part. Nothing interesting but horrible to read. Feel free to read it in the source code of vim-eqshow.

Problem solved? Uh, not really, we only print the stuff, but not read the code. Only half of the job is done. It's parsing time!

Lexing

Okay, keep this simple: the input we have to parse is really small (one line of code), we can afford storing it all in memory. Our pseudo-lexing phase is really dump:

1. Read the line of code
2. Add an end-of-input marker, like '$' 3. Add spaces around all symbols 4. Split on whitespaces 5. Enjoy symbols = ['+', '-', '/', '*', '(', ')', ', ', '<', '>', '^', ';', '.', '=', "'"] def lex(txt, syms): for s in syms: txt = txt.replace(s, ' ' + s + ' ') txt += '$'
return txt.split()

For our use, it's that simple.

Parsing

We have no choice other than writing a grammar. The easiest parser to implement is LL(0), and we will try to stick to it. Our parser is supposed to read valid input, we take away the burden of detailed error handling or allows to write the grammar in a more permissive form if it simplifies writing it. The grammar is defined as a list of couples: the first element is a the rule, the second is the action to apply if successfully matched.

A rule is composed by these three types of elements:

1. A regexp that must be matched against a string
2. A number i which will ask for recursively matching the i-th rule
3. A list of rules index that will be tried in turn until one matches

And a semantic action is:

1. A function
2. Indexes of the token that should be passed as arguments.
# This grammar is simplified for the sake of clarity

rules = [
# 0 Eq
([1, '=', 0], [Binop, 0, 1, 2]),
# 1 PlusMinus
([2, '\+|-', 1], [Binop, 0, 1, 2]),
# 2 Mul
([3, '\*', 2], [Binop, 0, 1, 2]),
# 3 Div
([[4, 5], '/', 4], [Frac, 0, 2]),
# 4 Atomic
(["(\w+)"], [Term, 0]),
# 5 Paren
(['$$', 1, '$$'], [Paren, 1]),
]

Here, to match the rule 0, we first try to match recursively 1, then the equals sign, then recursively try to call on itself on the rest of the input.

Here again, the algorithm profits from the fact that we already have all the input as a list:

1. Try to match the current grammar rule 2.a. if it didn't match, return without modification 2.b. if it matched, replace the n token involved with the result of the semantic action.
def parse(expr, eidx, rules, ridx):
i = 0
#print(expr, ridx)
for r in rules[ridx][0]:
if type(r) == list:
for subrule in r:
if parse(expr, eidx + i, rules, subrule):
break
elif isinstance(r, int):
parse(expr, eidx + i, rules, r)
elif not isinstance(expr[eidx + i], str) or \
re.search(r, expr[eidx + i]) == None:
return False
i += 1

node = rules[ridx][1]
args = []
for a in node[1:]:
args.append(expr[eidx + a])
expr[eidx : eidx + len(rules[ridx][0])] = [node[0](*args)]
return True

See the algorithm match 1 + 2 / 3 + 4:

    Start:
call trace: []
input: ['1', '+', '2', '/', '3', '+', '4', '$'] index: ^ [...] First token matched: call trace: [Eq in 1, PlusMinus in 2, Mul in 3, Div in [4], Atomic in "\w+"] input: [Term('1'), '+', '2', '/', '3', '+', '4', '$']
index:  ^
[...]

call trace: [Eq in 1, PlusMinus in "+"]
input: [Term('1'), '+', '2', '/', '3', '+', '4', '$'] index: ^ [...] Going back recursively in the rhs, about to match a token: call trace: [Eq in 1, PlusMinus in 1, PlusMinus in 2, Mul in 3, Div in [4], Atomic in "\w+"] input: [Term('1'), '+', Term('2'), '/', '3', '+', '4', '$']
index:                  ^
[...]

Back up in the call stack, we match the '/':
call trace: [Eq in 1, PlusMinus in 1, PlusMinus in 2, Mul in 3, Div in "\*"]
input: [Term('1'), '+', Term('2'), '/', '3', '+', '4', '$'] index: ^ [...] And now we match the rhs of '/' call trace: [Eq in 1, PlusMinus in 1, PlusMinus in 2, Mul in 3, Div in 4, Atomic in "\w+"] input: [Term('1'), '+', Term('2'), '/', Term('3'), '+', '4', '$']
index:                                  ^
[...]

Div successfully matched, replace the tokens with the result of the match
call trace: [Eq in 1, PlusMinus in 1, PlusMinus in 2, Mul in 3]
input: [Term('1'), '+', Frac(Term('2'), Term('3')), '+', '4', '$'] index: ^ [...] About to match the '+': call trace: [Eq in 1, PlusMinus in 1, PlusMinus in '+'] input: [Term('1'), '+', Frac(Term('2'), Term('3')), '+', '4', '$']
index:                                              ^
[...]

Matched, fetching the rhs:
call trace: [Eq in 1, PlusMinus in 1, PlusMinus in 1, PlusMinus in 2, Mul in 3, Div in 4, Atom]
input: [Term('1'), '+', Frac(Term('2'), Term('3')), '+', Term('4'), '$'] index: ^ [...] Going up in the call stack, the deepest PlusMinus matched: call trace: [Eq in 1, PlusMinus in 1, PlusMinus in 1] input: [Term('1'), '+', Binop(Frac(Term('2'), Term('3')), '+', Term('4')), '$']
index:                  ^
[...]

Going up in the call stack, the first PlusMinus matched too:
call trace: [Eq in 1, PlusMinus in 1, PlusMinus in 1]

Sine

It's really straigthforward, generate the ith value using the period, and recurse on the (i+1)th value.

sinW :: Float -> Int -> [Int16]
sinW freq i = sinW' i (getPeriod freq)

sinW' :: Int -> Float -> [Int16]
sinW' i period =
let new = (round $(* 32767.0)$ sin (fromIntegral i * period)) in
new:new:sinW' (i + 1) period

Square

The square waveform is $$A.\mathrm{sign}(\sin(i * \mathit{period}))$$.

squareW :: Float -> Int -> [Int16]
squareW freq i = squareW' i (getPeriod freq)

squareW' :: Int -> Float -> [Int16]
squareW' i period = let new = (round . (* 32767.0) . fromIntegral . sign
$sin (fromIntegral i * period)) in new:new:squareW' (i + 1) period sign x | x < 0 = -1 | otherwise = 1 Sawtooth Algorithmically, this is simple : samplesPerWavelength is the number of data to play during one period, ampStep is the "amount of amplitude" to add between each step. Start at int16_min, add ampStep until we reach int16_max, then restart. sawW :: Int -> Float -> [Int16] sawW i period = sawW' i period (-32767) sawW' :: Int -> Float -> Int -> [Int16] sawW' i period tempSample | i < samplesPerWavelength = let new = fromIntegral tempSample in new:new:sawW' (i + 1) period (tempSample + ampStep period) | otherwise = (-32767):(-32767):sawW' 0 period (-32767 + ampStep period) where samplesPerWavelength :: Int samplesPerWavelength = truncate$ 44100.0 / period
ampStep :: Int
ampStep = (44100 * 2) div samplesPerWavelength period

Triangle

This looks like the sawtooth: samplesPerWavelength is the number of data to play during one period, ampStep is the "amount of amplitude" to add or substract between each step. Start at int16_min, add ampStep until we reach int16_max, then substract ampStep until we reach int16_min etc...

triW :: Float -> Int-> [Int16]
triW period i = triW' period (-32766)
$(44100 * 3) div (samplesPerWavelength period) where samplesPerWavelength :: Float -> Int samplesPerWavelength freq = truncate$ 44100.0 / freq

triW' :: Float -> Int -> Int -> [Int16]
triW' period tempSample ampStep
| abs tempSample > 32767 =
let new = fromIntegral \$ tempSample + ampStep in
new:new:triW' period (tempSample - ampStep) (-ampStep)
| otherwise =
fromIntegral tempSample
:fromIntegral tempSample:triW' period (tempSample + ampStep) ampStep

Playing notes

Okay, now you can play frequencies. You should create a simple array which allows a note number to be converted to a frequency, and you'll be able to play notes.

Wavetable Synthesis

As I promised, here, we do some wavetable synthesis: just play some waveforms a few milliseconds.

int main()
{
double freqs[] = {
#define X(A) A,
# include "freqs.def"
#undef X
};

srand(time(nullptr));

// One theme I composed for Subliminal AEon
int note = 0;
int song[] = {
48, 55, 56, 48, 51, 55, 50, 53,
48, 55, 56, 48, 51, 55, 50, 53,
47, 50, 51, 55, 51, 50, 51, 48,
47, 50, 51, 55, 51, 50, 51, 48
};

while (true)
{
for (size_t i = 0; i < 5; ++i)
{
// Second parameter is the number or milliseconds to play
Sinus(freqs[song[note]], 6);
Square(freqs[song[note]], 5);
Sinus(freqs[song[note]], 7);
Saw(freqs[song[note]], 3);
Triangle(freqs[song[note]], 3);
Sinus(freqs[song[note]], 7);
White(2);
Sinus(freqs[song[note]], 7);
}
note = (note + 1) % (sizeof (song) / sizeof (song[0]));
}
return (0);
}

And just play that using

./a.out | aplay -c 2 -f U16_LE -r 44100

You may have to adjust parameters of aplay depending on your architectures

Going further

New waves

As an example, you can now merge waveforms by adding or multiplying them to create additionnal waveforms. Feel free to create whatever you want when merging.

triPlusSquare :: Float -> [Int16]
triPlusSquareW freq = zipWith addW (triW freq) (squareW (freq * 2))

addW :: Int16 -> Int16 -> Int16
addW x y = (x div 2 + y div 2)

Filters

It sounds really acid. We will give us the possibility (in the next article :D) to use Fourier's transform to create {low,high}-pass filters.

Conclusion

I was really proud to create my own synthesizer, you should do it too :D !

- page 2 of 5 -