
## Saturday, 7 December 2013

### The Value of Money in Dodger

Categories: [ Science ]

Terry Pratchett's Dodger is said to take place in the first quarter of the Victorian Era. We'll assume it is the year 1840. According to the National Archives' currency converter, £1 in 1840 is worth about 45 GBP in 2005.

Moreover, the same source indicates that with 100 GBP (about £2 5s) you could buy 11 days of a craftsman's wages in the building trade, 3 stones (42 lbs) of wool or 1 quarter (28 lbs) of wheat. As a reference, a person's daily energy needs are equivalent to about 2 lbs of wheat (representing 2950 kcal according to Wikipedia).

These are the coins mentioned in Dodger:

• Guinea: 21s = 252d (47.25 GBP)
• Sovereign: 1 £ = 20s = 240d (45 GBP)
• Crown: 5s = 60d (11.25 GBP)
• Half-Crown: 2.5s = 30d (5.63 GBP) (about 1.5 lbs of wheat)
• Shilling: 1s = 12d (2.25 GBP)
• Sixpence: 1/2s = 6d (1.13 GBP)
• Groat: 4d (0.75 GBP)
• Thruppence: 3d (0.56 GBP)
• Penny: 1d (0.19 GBP)
• Half-penny: 1/2d (0.10 GBP)
• Farthing: 1/4d (0.05 GBP)

A day's worth of a craftsman's wages is therefore 4s, which could buy 2.5 lbs of wheat or 4 lbs of wool.
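The conversions above are plain proportional arithmetic, so a small sketch can help to check them. It assumes the rate of 45 GBP (2005) per 1840 pound and the 100 GBP price points quoted above; the function names are made up for this illustration.

```python
from fractions import Fraction

# Figures quoted in the post (the function names below are illustrative only).
GBP_2005_PER_POUND = 45      # 1 pound of 1840 is about 45 GBP of 2005
PENCE_PER_SHILLING = 12
SHILLINGS_PER_POUND = 20
PENCE_PER_POUND = PENCE_PER_SHILLING * SHILLINGS_PER_POUND  # 240d

# 100 GBP (2005) bought 11 days of wages, 42 lbs of wool or 28 lbs of wheat.
WAGE_DAYS_PER_100_GBP = 11
WOOL_LBS_PER_100_GBP = 42
WHEAT_LBS_PER_100_GBP = 28


def pence_to_gbp_2005(pence):
    """Convert an 1840 amount in pence to 2005 GBP."""
    return float(Fraction(pence) * GBP_2005_PER_POUND / PENCE_PER_POUND)


def gbp_2005_to_shillings(gbp):
    """Convert a 2005 GBP amount back to 1840 shillings."""
    return gbp / GBP_2005_PER_POUND * SHILLINGS_PER_POUND


def lbs_bought(gbp, lbs_per_100_gbp):
    """Weight of a commodity that a 2005 GBP amount would buy."""
    return gbp * lbs_per_100_gbp / 100


if __name__ == "__main__":
    daily_wage_gbp = 100 / WAGE_DAYS_PER_100_GBP
    print("Guinea: %.2f GBP" % pence_to_gbp_2005(252))
    print("Daily wage: %.1f s" % gbp_2005_to_shillings(daily_wage_gbp))
    print("Wheat per day: %.1f lbs" % lbs_bought(daily_wage_gbp, WHEAT_LBS_PER_100_GBP))
    print("Wool per day: %.1f lbs" % lbs_bought(daily_wage_gbp, WOOL_LBS_PER_100_GBP))
```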

[ Posted on 7 December 2013 at 18:06 | no comments | ]

## Sunday, 15 September 2013

### Instantiating Many Objects in Python

Categories: [ Computing ]

I have a list of files in a text file, and I want to load this list into some kind of data structure. The list is quite long and requires instantiating 100,000 Python objects, all of the same type. I found out that, depending on the kind of object used, the time it takes to instantiate them all can vary greatly. Essentially, each line of the file is composed of tab-separated fields, which are split into a list with Python's `str.split()` method. The question therefore is: what should I do with that list?

The object must hold a few values, so basically a list or a tuple would be enough. However, I need to perform various operations on those values, so additional methods would be handy and justify the use of a more complex object.

#### The Contenders

These are the objects I compared:

A simple `list`, as returned by `str.split()`. It is not very handy, but will serve as a reference.

A simple `tuple`, no more handy than the `list`, but it may exhibit better performance (or not).

A class named `List` that inherits from `list`:
```
class List(list):
    # Accessors for the three fields of a split line
    def a(self): return self[0]
    def b(self): return self[1]
    def c(self): return self[2]
```
A class named `Tuple` that inherits from `tuple`:
```
class Tuple(tuple):
    # Accessors for the three fields of a split line
    def a(self): return self[0]
    def b(self): return self[1]
    def c(self): return self[2]
```
A class named `ListCustomInitList` that inherits from `List` and adds a custom `__init__()` method:
```
class ListCustomInitList(List):
    def __init__(self, *args): List.__init__(self, args)
```
A class named `TupleCustomInitTuple` that inherits from `Tuple` and adds a custom `__init__()` method:
```
class TupleCustomInitTuple(Tuple):
    # A tuple's content is set by __new__, so __init__ has nothing to pass on
    def __init__(self, *args): Tuple.__init__(self)
```
A class named `ListCustomInit` that inherits from the `list` basic type but has the same features as `ListCustomInitList` instead of inheriting them from the custom `List`:
```
class ListCustomInit(list):
    def __init__(self, *args): list.__init__(self, args)
    def a(self): return self[0]
    def b(self): return self[1]
    def c(self): return self[2]
```
A class named `TupleCustomInit` that inherits from the `tuple` basic type but has the same features as `TupleCustomInitTuple` instead of inheriting them from the custom `Tuple`:
```
class TupleCustomInit(tuple):
    def __init__(self, *args): tuple.__init__(self)
    def a(self): return self[0]
    def b(self): return self[1]
    def c(self): return self[2]
```
A class named `NamedTuple` that is made from the `namedtuple` type in the `collections` module:
```
from collections import namedtuple

NamedTuple = namedtuple("NamedTuple", ("a", "b", "c"))
```
A very basic class named `Class` that inherits from `object`:
```
class Class(object):
    def __init__(self, args):
        self.a = args[0]
        self.b = args[1]
        self.c = args[2]
```
A variant of the previous that uses the `__slots__` feature:
```
class Slots(object):
    # __slots__ replaces the per-instance __dict__ with fixed attribute slots
    __slots__ = ("a", "b", "c")
    def __init__(self, args):
        self.a = args[0]
        self.b = args[1]
        self.c = args[2]
```
An old-style class, named `OldClass`, that does not inherit from `object`:
```
# Old-style only under Python 2; in Python 3 every class is new-style
class OldClass:
    def __init__(self, args):
        self.a = args[0]
        self.b = args[1]
        self.c = args[2]
```

#### The Benchmark

Each class is instantiated 100,000 times in a loop, with the same constant input data: `["a", "b", "c"]`; the newly created object is then appended to a list. This process is timed by calling `time.clock()` before and after it and retaining the difference between the two values. The `time.clock()` function has quite a poor resolution, but is immune to the process being put to sleep by the operating system's scheduler.

This is then repeated 10 times, and the smallest of these 10 values is retained as the performance of the process.
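The procedure described above could be sketched roughly as follows. This is not the original benchmark script: `benchmark` and `keep_list` are names invented for the illustration, the baseline is assumed to keep the list returned by `str.split()` unchanged, and `time.process_time()` stands in for `time.clock()`, which was removed in Python 3.8.

```python
import time


def benchmark(factory, data, count=100000, repeats=10):
    """Return the smallest CPU time, in seconds, needed to build `count` objects."""
    best = float("inf")
    for _ in range(repeats):
        objects = []
        start = time.process_time()
        for _ in range(count):
            # Build one object from the constant input and store it
            objects.append(factory(data))
        best = min(best, time.process_time() - start)
    return best


def keep_list(args):
    """Assumed baseline: keep the list produced by str.split() as it is."""
    return args


if __name__ == "__main__":
    data = ["a", "b", "c"]
    reference = benchmark(keep_list, data)
    for name, factory in (("list", keep_list), ("tuple", tuple)):
        elapsed = benchmark(factory, data)
        # Guard against a zero reference on clocks with a coarse resolution
        ratio = elapsed / reference if reference > 0 else float("nan")
        print("%.3f %s" % (ratio, name))
```

Any of the classes listed earlier can be passed as `factory`, provided it accepts the list as its single argument.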

#### The Results

The results from the benchmark are shown relative to the time taken by a simple `list`. As expected, the use of a simple `list` is the fastest, since it requires no additional object instantiation. Below are the results:

• 1.000 list
• 2.455 tuple
• 3.273 Tuple
• 3.455 List
• 4.636 Slots
• 5.818 NamedTuple
• 6.364 OldClass
• 6.455 Class
• 6.909 TupleCustomInit
• 7.091 TupleCustomInitTuple
• 7.545 ListCustomInit
• 7.818 ListCustomInitList

#### Conclusion

One can draw several conclusions from this experiment:

• Not instantiating anything is much faster: even instantiating a simple tuple out of the original list increases the run time by about 150%
• The `__slots__` feature makes object instantiation 28% faster compared to a regular class
• Deriving a class from a basic type and adding a custom `__init__()` method that calls the parent's `__init__()` adds a lot of overhead (instantiation is 7 to 8 times slower than keeping the plain list)

[ Posted on 15 September 2013 at 15:13 | no comments | ]

## Thursday, 21 March 2013

Categories: [ Computing ]

Passwords are difficult to generate and to remember, and once you finally know how to type yours quickly, you don't want to change it. That's usually the time when someone forces you to change it… Here is a synthesis of what I've found out about how to generate secure passwords.

#### Entropy as a measure of password strength

The strength of a password is usually expressed as its entropy, measured in bits. In a nutshell, it expresses the total number of different passwords that can be created (given some construction rules), represented as the base-2 logarithm of that total number. For example, if you know that a password is composed of a single character which may be a letter (uppercase or lowercase), a digit, a white space or a period (which conveniently makes 64 different symbols: 26 lowercase letters, plus 26 uppercase letters, plus 10 digits, plus 2 punctuation symbols), the entropy of that password is 6 bits (because $2^6 = 64$). Non-integer entropy values are valid, so for example a single lowercase letter has an entropy of approximately 4.7 bits (because $2^{4.7} \approx 26$). Each additional bit of entropy doubles the total number of possible passwords; a password made of 2 characters (from the same 64 symbols) therefore has an entropy of 12 bits, and a password made of 8 lowercase letters has an entropy of 37.6 bits.
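As a quick check of these figures, here is a minimal sketch of the computation, assuming every symbol is drawn independently and uniformly; `entropy_bits` is a name chosen for this example.

```python
import math


def entropy_bits(symbol_count, length=1):
    """Entropy of a random password: length times log2 of the alphabet size."""
    return length * math.log2(symbol_count)


if __name__ == "__main__":
    print("1 char out of 64 symbols: %.1f bits" % entropy_bits(64))
    print("1 lowercase letter:       %.1f bits" % entropy_bits(26))
    print("2 chars out of 64:        %.1f bits" % entropy_bits(64, 2))
    print("8 lowercase letters:      %.1f bits" % entropy_bits(26, 8))
```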

#### The human factor

The above entropy measurement is true only if the password is truly randomly generated and each symbol has an equal probability of being selected. Humans seem to be rather bad at generating random passwords: NIST Special Publication 800-63 estimates the entropy of a human-generated password of length 8 at 18 bits.

Moreover, if the password is a word from a natural language, the number of possible different passwords is equal to the size of the vocabulary in that language; for the English language this is estimated to be about 250,000 words. The entropy of a password made of a single English word is therefore approximately 17.9 bits.

#### Forcing more entropy

To increase the entropy of human-generated passwords, it is quite common to enforce rules, such as a minimum length, the use of more symbols than just the 26 lowercase letters and forbidding the use of common words. The NIST report above estimates that the additional symbols add 6 bits of entropy and the dictionary check adds 6 bits. An 8-character password following all the rules above is therefore estimated to have an entropy of 30 bits. For comparison, a randomly generated password of 8 characters chosen amongst the most common symbols on a computer keyboard (80 symbols) has an entropy of 50.6 bits.

Such passwords, however, become difficult to remember, especially if you have to memorize several of them and are forced to change them every few months.

And they are still pretty insecure.

There are two different methods for cracking a password.

The first method consists in connecting to the service asking for the password and trying passwords until the right one is found. This method is slow: one can expect to test at most a few dozen passwords per second (let's say 100 passwords per second). Using entropy to measure the strength of the attack (i.e., the base-2 logarithm of the number of passwords tested in a given time), that represents 6.6 bits per second, or 23.0 bits/day, or 27.9 bits/month, or 31.5 bits/year.

This gives the following times:

• 18 bits password: at most 45 minutes
• 30 bits password: at most 128 days
• 50.6 bits password: at most 560,000 years

The thing here is that reasonably secure services will not allow that many attempts.

The second method for cracking passwords requires a list of encrypted passwords, e.g., stolen from a badly secured service. Depending on the encryption algorithm used with those passwords and the hardware at hand, one can expect an attacker to try between 2,000 and 15,500,000,000 passwords per second (between 11 and 33.8 bits/s) with a standard desktop computer (equipped with a modern GPU).

This gives the following times:

• 18 bits password: at best 128 seconds, at worst a few microseconds
• 30 bits password: at best 6 days, at worst less than a second
• 50.6 bits password: at best about 27,000 years, at worst 32 hours.
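The times in both lists follow from dividing the number of possible passwords by the guessing rate. A minimal sketch, assuming an exhaustive search at a constant rate (the helper names are invented here); it uses the exact rates rather than bit values rounded to one decimal, so results can differ slightly from the rounded figures.

```python
def max_crack_seconds(entropy_bits, guesses_per_second):
    """Worst-case time to try every password of the given entropy."""
    return 2 ** entropy_bits / guesses_per_second


def humanize(seconds):
    """Format a duration with the largest unit that fits (rough, for display)."""
    for unit, size in (("years", 365.25 * 86400), ("days", 86400),
                       ("hours", 3600), ("minutes", 60)):
        if seconds >= size:
            return "%.1f %s" % (seconds / size, unit)
    return "%.3g seconds" % seconds


if __name__ == "__main__":
    # Rates quoted in the text: online guessing, slow hash, fast hash on a GPU
    for rate in (100, 2000, 15.5e9):
        for bits in (18, 30, 50.6):
            worst = max_crack_seconds(bits, rate)
            print("%g guesses/s, %.1f bits: %s" % (rate, bits, humanize(worst)))
```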

#### How many bits are needed?

The times indicated above represent the maximum time needed for cracking the password. There is a 50% chance of cracking it in half that time, and a 10% chance of cracking it in a tenth of that time.

So if a password needs to be safe for at least 1 year, the time needed for cracking it needs to be at least a year, i.e., the password needs 33.8 + 24.9 = 58.7 bits (entropy of the number of passwords tested per second plus the “entropy” of the number of seconds per year). There is however a chance that the password will be cracked in less time. Adding 1 bit of entropy will reduce the attacker's chance of finding the password in a given time by half, and adding 10 bits reduces it to 1 chance out of 1024 to crack it in that time. 7 bits would reduce it to 1 chance out of 128, which may be sufficient as well.
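The sum above can be reproduced with a short sketch, assuming a constant guessing rate; `required_bits` and its `margin_bits` parameter are names invented for this illustration. Because it does not round each term to one decimal, its result may differ from the figure above by about a tenth of a bit.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def required_bits(guesses_per_second, safe_seconds, margin_bits=0.0):
    """Entropy needed so that an exhaustive search lasts at least `safe_seconds`.

    `margin_bits` is the safety margin: each extra bit halves the attacker's
    chance of succeeding within that time.
    """
    return math.log2(guesses_per_second) + math.log2(safe_seconds) + margin_bits


if __name__ == "__main__":
    fast_attacker = 15.5e9  # passwords per second, the upper rate quoted earlier
    print("1 year:              %.1f bits" % required_bits(fast_attacker, SECONDS_PER_YEAR))
    print("1 year, 1/128 odds:  %.1f bits" % required_bits(fast_attacker, SECONDS_PER_YEAR, 7))
    print("1 year, 1/1024 odds: %.1f bits" % required_bits(fast_attacker, SECONDS_PER_YEAR, 10))
```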

#### How to generate such a password?

A 68.7-bit password (58.7 bits plus a 10-bit safety margin) means 15 lowercase letters or 11 common keyboard symbols. These have to be selected by a true random process, such as dice rolls, radioactive decay or electronic thermal noise. 6-sided dice are easy to come by, and the Diceware method is probably the easiest one for generating secure and easy-to-remember passwords. A roll of 5 dice selects one word in a list of 7,776, providing 12.9 bits of entropy. The strength of the password therefore depends on the number of words that are selected (by repeatedly rolling 5 dice):

• 1 word: 12.9 bits, cracked in less than a microsecond
• 2 words: 25.8 bits, cracked in less than 4 milliseconds
• 3 words: 38.7 bits, cracked in less than 30 seconds
• 4 words: 51.6 bits, cracked in less than 3 days
• 5 words: 64.5 bits, cracked in less than 55 years
• 6 words: 77.4 bits, cracked in less than 423,000 years
• 7 words: 90.3 bits, cracked in less than 3231 million years
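The list above can be recomputed with a few lines, assuming the faster rate of 15.5 billion guesses per second quoted earlier; the function names are made up for this sketch. It uses the exact value of log2(7776), about 12.925, rather than 12.9, so the last digit of some figures can differ.

```python
import math

WORDLIST_SIZE = 6 ** 5         # 5 six-sided dice select 1 word out of 7,776
GUESSES_PER_SECOND = 15.5e9    # fast offline attack rate quoted in the text


def diceware_bits(words):
    """Entropy of a passphrase made of `words` randomly chosen words."""
    return words * math.log2(WORDLIST_SIZE)


def diceware_max_crack_seconds(words):
    """Worst-case exhaustive search time for such a passphrase."""
    return WORDLIST_SIZE ** words / GUESSES_PER_SECOND


if __name__ == "__main__":
    for words in range(1, 8):
        print("%d words: %.1f bits, at most %.3g seconds"
              % (words, diceware_bits(words), diceware_max_crack_seconds(words)))
```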

The Diceware method also allows adding a random non-letter symbol to the password, which adds about 9.5 bits of entropy for a 20-character password (about 5 words). Therefore a 5-word password with one random symbol can be considered secure for at least a few years.

#### How long will the password be safe?

Between 2002 and 2011, CPU and GPU computing power were multiplied by 10 and 100 respectively, i.e., +0.37 and +0.74 bits/year regarding password cracking. The rate of growth will probably not remain that high, but if one wants to keep a password for more than a year or two, it should be taken into consideration. For example, if a password must remain safe for the next 4 years, add 3 bits. The 5-word password with one random symbol will therefore be safe for the next 7 years.
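The yearly gain is the base-2 logarithm of the speed-up divided by the number of years. A small sketch, assuming the growth is spread evenly over the 9 years from 2002 to 2011 (the function names are hypothetical):

```python
import math


def cracking_gain_bits_per_year(speedup, years):
    """Bits of cracking capacity gained per year for a given overall speed-up."""
    return math.log2(speedup) / years


def extra_bits_needed(speedup, years, lifetime_years):
    """Bits to add so that a password survives `lifetime_years` of such growth."""
    return cracking_gain_bits_per_year(speedup, years) * lifetime_years


if __name__ == "__main__":
    print("CPU: +%.2f bits/year" % cracking_gain_bits_per_year(10, 9))
    print("GPU: +%.2f bits/year" % cracking_gain_bits_per_year(100, 9))
    print("4 years of GPU growth: +%.1f bits" % extra_bits_needed(100, 9, 4))
```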

One must also consider that computer clusters are becoming affordable, and that a 25-GPU computer has been built exactly for the purpose of cracking passwords. This type of machine adds about 4 bits to the capacity for cracking encrypted passwords (the “second method” above). That makes the 5-word Diceware passphrase safe for barely over a year. Finally, cloud computing and parasitic computing using cloud-based browsers may reduce the safety period even further.

#### Conclusion

The only truly secure passwords are long and truly random; any other method for generating passwords will lead to easily crackable passwords, and therefore gives a false sense of security. Long enough passwords need to be changed, but not too often; 3 years is a reasonable lifetime. The Diceware method makes it simple to generate such passwords.

Finally, memorizing a lot of passwords is difficult and induces people to reuse the same passwords. There is a simple solution to that, promoted by Bruce Schneier: write down your password and keep it in your wallet.

[ Posted on 21 March 2013 at 22:50 | no comments | ]

## Tuesday, 5 March 2013

### Seven Segment Display

Categories: [ Science ]

I recently discussed with a friend how to read with a computer the 3-digit numbers from a device using seven-segment displays. One solution we came up with was to put a phototransistor in front of each segment, read the seven on/off signals and recognize the digits. I then wondered if it's possible to use fewer than seven phototransistors per digit.

A minimum of four segments is obviously required, but after a bit of computer-aided experimentation, I found out that only 5 segments are enough: if you remove the lower and the lower-right segments, you can still identify all ten digits.
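The search can be sketched as a brute-force pass over all subsets of segments. This is not the original experiment: it assumes the conventional segment letters (a top, b upper right, c lower right, d bottom, e lower left, f upper left, g middle) and the common glyphs where 6 and 9 have a tail and 7 has no upper-left segment; other glyph variants may give different results.

```python
from itertools import combinations

SEGMENTS = "abcdefg"

# Lit segments for each digit (assumed glyph set, see the note above)
DIGITS = {
    0: "abcdef", 1: "bc", 2: "abdeg", 3: "abcdg", 4: "bcfg",
    5: "acdfg", 6: "acdefg", 7: "abc", 8: "abcdefg", 9: "abcdfg",
}


def distinguishes(kept):
    """True if the ten digits still look different when only `kept` is observed."""
    kept = set(kept)
    patterns = {frozenset(set(lit) & kept) for lit in DIGITS.values()}
    return len(patterns) == len(DIGITS)


def smallest_subsets():
    """Return every smallest set of segments that identifies all ten digits."""
    for size in range(1, len(SEGMENTS) + 1):
        found = ["".join(c) for c in combinations(SEGMENTS, size)
                 if distinguishes(c)]
        if found:
            return found
    return []


if __name__ == "__main__":
    print(smallest_subsets())
```

With this encoding the search returns a single five-segment set, a, b, e, f and g, which is the display without its bottom and lower-right segments.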

So with only 15 inputs instead of 21, you can read the 3 digits, using a 16-bit I/O expander (e.g., an MCP23017; this one even has internal 100 kΩ pull-up resistors, so it may be that nothing other than the phototransistors is needed).

[ Posted on 5 March 2013 at 22:19 | no comments | ]

## Tuesday, 19 February 2013

### Useless Machine

Categories: [ DIY ]

When you flip the switch, a finger comes out of the box and flips it back the other way. Strictly useless, and therefore absolutely indispensable.

Seen from the outside, it is a plain white-wood box with no ornament apart from the switch on the lid.

When you push the switch, here is what happens:

A video of the box in action is also available.

Under the hood is a motor fitted with a 1:228 gearbox, powered by 4 rechargeable batteries. When the switch is flipped, the “finger” rises and flips the switch back the other way, which reverses the polarity and makes the motor turn in the opposite direction. The “finger” then returns to its initial position and ends its travel on a second switch, which cuts the current. The motor stops.

Mechanically and electrically there is nothing complicated, apart from positioning the parts in the right place.

Finally, a video of the machine in action with the lid open. Notice that the “finger” jumps when it reaches the switch, at the moment the current is cut, then comes back to rest on it a second time, more gently, and stays in place.

[ Posted on 19 February 2013 at 21:51 | no comments | ]