Caveat emptor with iOS / HealthKit step data

Step counts have not been recorded uniformly before and after September 2016 on the iPhone, which leads to some artefacts. This slight complication might be interesting to those who intend to analyse long periods of health data.

The change came with an iOS update. Helpfully, the data points exported from the Health app record the iOS version that was current at the time, for records made after iOS 9. Perhaps you can spot the difference before and after iOS 10 below. The plot shows steps/second over the years from the same device. Each dot was calculated from one record.
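A minimal sketch of how one such dot could be computed from one record. It assumes that the export's `Record` elements carry `type`, `startDate`, `endDate`, `value` and `sourceVersion` attributes and use the date format shown; the two sample records are invented for illustration and are not taken from the plotted data.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Invented sample in the assumed shape of a Health app export.
SAMPLE = """<HealthData>
 <Record type="HKQuantityTypeIdentifierStepCount" sourceVersion="9.3.2"
         startDate="2016-06-01 08:00:00 +0100"
         endDate="2016-06-01 08:01:00 +0100" value="90"/>
 <Record type="HKQuantityTypeIdentifierStepCount" sourceVersion="10.0.1"
         startDate="2016-10-01 08:00:00 +0100"
         endDate="2016-10-01 08:10:00 +0100" value="600"/>
</HealthData>"""

DATE_FORMAT = "%Y-%m-%d %H:%M:%S %z"  # assumed export date format


def step_rates(xml_text):
    """Return (start time, iOS version, steps per second) per step record."""
    rows = []
    for rec in ET.fromstring(xml_text).iter("Record"):
        if rec.get("type") != "HKQuantityTypeIdentifierStepCount":
            continue
        start = datetime.strptime(rec.get("startDate"), DATE_FORMAT)
        end = datetime.strptime(rec.get("endDate"), DATE_FORMAT)
        seconds = (end - start).total_seconds()
        if seconds > 0:  # skip zero-length records to avoid dividing by zero
            rate = float(rec.get("value")) / seconds
            rows.append((start, rec.get("sourceVersion"), rate))
    return rows


rates = step_rates(SAMPLE)
for start, version, rate in rates:
    print(start.date(), "iOS", version, f"{rate:.2f} steps/s")
```

Grouping the rows by the major number of the version string would then separate the pre- and post-iOS 10 records.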

Magnification around the update shows that there are fewer data points post iOS 10:

Hidden messages

Mapp and Lucia aficionados are clearly at an advantage here. This was part of a birthday treasure hunt for someone special: a poster hanging at the Barbican Library, and a transparency in a book at the Guildhall Gallery that, when overlapped, gave a cue.

The poster. Sesquiannual meetings!

The transparency. The parallel bars are a different puzzle.


A small academic cottage industry has improved methods for hiding information so that it is only revealed if innocuous-looking pictures are overlapped. Properly done visual cryptography can offer the strength of a one-time-pad and still be decoded by merely overlapping a pre-defined number of transparencies (or “shares”) with seemingly random or unrelated patterns.

No such pretensions here though, and a very simple method was used: I represented each grayscale pixel in the patent (i.e. obvious) images with a 2×2 black-and-white matrix. Depending on the darkness of the original pixel, 0, 1/4, 1/2, 3/4 or 4/4 of the matrix was black. When dithering an image this way, for middle tones there is a choice of different patterns with equal darkness. For example, there are six ways to set half the matrix black: ▚, ▞, ▌, ▐, ▀, ▄. Depending on the pattern combination, two overlapping 1/2 black matrices can be 1/2, 3/4, or 4/4 black:

▚ + ▚ = ▚   or   ▀ + ▀ = ▀

▚ + ▌ = ▙   or   ▀ + ▐ = ▜

▚ + ▞  = █   or  ▀ + ▄ = █

and so on.

Generally a 2×2 matrix in the hidden image can represent any grey that is at least as dark as the darker of the two overlapping matrices.
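The overlap rule can be checked with a few lines of Python. This is only an illustrative sketch, not the code used to make the poster: each 2×2 matrix is modelled as the set of its black cells, and stacking two transparencies is modelled as the union of the two sets. The cell numbering and the function names are assumptions made for this example.

```python
from itertools import combinations

# Cells of the 2x2 matrix: 0 = upper left, 1 = upper right,
# 2 = lower left, 3 = lower right. A pattern is the set of black cells.
HALF_BLACK = {
    "▚": frozenset({0, 3}),
    "▞": frozenset({1, 2}),
    "▌": frozenset({0, 2}),
    "▐": frozenset({1, 3}),
    "▀": frozenset({0, 1}),
    "▄": frozenset({2, 3}),
}


def overlap(a, b):
    """Stack two transparencies: a cell is black if it is black on either."""
    return a | b


def darkness(pattern):
    """Number of black quarters, from 0 to 4."""
    return len(pattern)


# Every way of blackening two of the four cells: the six half-black patterns.
all_half = {frozenset(cells) for cells in combinations(range(4), 2)}

# Darkness values reachable by overlapping two half-black matrices.
reachable = sorted({darkness(overlap(a, b)) for a in all_half for b in all_half})

for left, right in [("▚", "▚"), ("▚", "▌"), ("▚", "▞")]:
    quarters = darkness(overlap(HALF_BLACK[left], HALF_BLACK[right]))
    print(f"{left} + {right}: {quarters}/4 black")
```

Because the overlap is a union, the result can never be lighter than either input, which is the lower bound on the hidden image's grey stated above.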

Our daily bread, equivalent of 11-37 kg of batteries

A back of the envelope calculation on how many kilograms of lithium-ion batteries we would have to carry around to power us for a day — until the next nightly recharge.

Depending on age and sex, about 8-13 MJ of energy is needed for a day’s existence, assuming light work. Lithium-ion batteries are widespread in smartphones, electric cars and other electronics, not least because of their relatively large specific energy of 0.3-0.7 MJ/kg.

Given this, our battery pack would weigh between 11 and 37 kg for once-a-day recharge.

Having calculated this, how about body fat, our own kind of storage medium? Population average body fat content is around 20 %, and about 10 % body fat is essential. Assuming a body weight of about 80 kg, this leaves about 8 kg storage fat per person or 8 kg * 39.5 MJ/kg = 316 MJ stored energy.

Even in the best case (0.7 MJ/kg), that is 451 kg of batteries to carry.
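The arithmetic can be replayed in a few lines of Python, using only the rounded figures quoted in the text; the variable names are made up for this sketch. Note that with these rounded inputs the heaviest pack comes out at about 43 kg. The 37 kg quoted above would correspond to a lower specific-energy bound of roughly 0.35 MJ/kg, which is a guess and not a figure from the text.

```python
# Figures as quoted in the text (rounded).
DAILY_NEED_MJ = (8.0, 13.0)              # low and high daily energy need
SPECIFIC_ENERGY_MJ_PER_KG = (0.3, 0.7)   # worst and best lithium-ion cells
FAT_ENERGY_MJ_PER_KG = 39.5


def battery_mass_kg(energy_mj, specific_energy_mj_per_kg):
    """Mass of batteries needed to store the given energy."""
    return energy_mj / specific_energy_mj_per_kg


# Lightest pack: low need, best cells. Heaviest: high need, worst cells.
lightest_pack = battery_mass_kg(DAILY_NEED_MJ[0], SPECIFIC_ENERGY_MJ_PER_KG[1])
heaviest_pack = battery_mass_kg(DAILY_NEED_MJ[1], SPECIFIC_ENERGY_MJ_PER_KG[0])

# Storage fat: 80 kg body weight, 20 % fat in total, 10 % of it essential.
storage_fat_kg = 80.0 * (0.20 - 0.10)
fat_energy_mj = storage_fat_kg * FAT_ENERGY_MJ_PER_KG
fat_as_batteries_kg = battery_mass_kg(fat_energy_mj, SPECIFIC_ENERGY_MJ_PER_KG[1])

print(f"daily pack: {lightest_pack:.1f} to {heaviest_pack:.1f} kg")
print(f"storage fat: {fat_energy_mj:.0f} MJ, or {fat_as_batteries_kg:.0f} kg of the best batteries")
```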

Newton stood on the shoulders of 14 m tall giants (or less)

If I have seen further it is by standing on the shoulders of giants.

(Newton in a letter to Robert Hooke)

From this we can calculate the giants to be at most 14.3 m tall, assuming they are human-shaped. This is because, in order to see further, Newton’s eyes must be higher than the eyes of the giant, i.e. his eye height, standing, must be larger than the shoulder-eye-distance of the giants. Sir Isaac is reported to have been five feet six inches (UK), which is about 167.6 cm. Using present-day median values for eye height and shoulder height (see below) for approximate proportions, his eyes were at about 156.0 cm. Taking this as the upper limit of the giant’s shoulder-eye-distance, it follows by proportion that the giant’s height is at most about 9.198 times that distance, that is about 14.3 meters, or roughly 8.6 times Newton’s own height.

Finally, it is worth considering that as giants are taller they are probably also proportionately wider and thicker than Newton, so that at maximum they are about 626 times heavier than him (the cube of the height ratio). When proportionately scaling up body sizes the weight scales by cubes but bone cross section area, which determines their maximal load, only by squares. If giants are subject to biological limits of bone strength, then their bones have at worst only about a ninth of the relative strength of Newton’s. Thus such giants can probably best bear their body weight (and Newton’s) when standing neck-deep under water. That however would defeat the purpose.

Detailed calculation:

AVERAGE HUMAN (50th percentiles, in cm)
eye height = 163.26
shoulder height = 144.18
shoulder-eye-distance = 19.08
total height = 175.49
EH:TH = 0.93031
SE:TH = 0.10872414

NEWTON (see here and more here)
with some likelihood five feet six inches  = 167.64 cm, then eye height by proportion = 155.9572

shoulder-eye-distance < 155.9572 cm, then total height by proportion < 1434.43 cm.
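The same calculation as a short Python sketch, using only the percentile values and Newton’s reported height from above; the variable names are chosen for this example. Note that the factor of about 9.198 is the ratio of total height to shoulder-eye-distance. Relative to Newton’s full height the giant is about 8.56 times taller, so proportionate scaling makes the giant about 626 times heavier.

```python
# 50th percentile values in cm, as listed above.
EYE_HEIGHT = 163.26
SHOULDER_HEIGHT = 144.18
TOTAL_HEIGHT = 175.49
NEWTON_HEIGHT = 167.64  # five feet six inches in cm

shoulder_eye = EYE_HEIGHT - SHOULDER_HEIGHT    # 19.08 cm
eye_to_total = EYE_HEIGHT / TOTAL_HEIGHT       # EH:TH
shoulder_eye_to_total = shoulder_eye / TOTAL_HEIGHT  # SE:TH

# Newton's eye height, by proportion.
newton_eye_height = NEWTON_HEIGHT * eye_to_total

# The giant's shoulder-eye-distance must stay below Newton's eye height,
# which bounds the giant's total height by proportion.
giant_max_height = newton_eye_height / shoulder_eye_to_total

# Proportionate scaling: mass grows with the cube of the height ratio,
# bone cross section only with the square.
height_ratio = giant_max_height / NEWTON_HEIGHT
mass_ratio = height_ratio ** 3
relative_bone_strength = height_ratio ** 2 / mass_ratio  # equals 1 / height_ratio

print(f"Newton's eye height: {newton_eye_height:.2f} cm")
print(f"giant at most: {giant_max_height:.2f} cm")
print(f"height ratio {height_ratio:.2f}, mass ratio {mass_ratio:.0f}")
```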


Footnote: The giants and shoulders metaphor has been used at least since scholasticism.

Filter FASTA files by sequence id using a regular expression

Use a regular expression for filtering sequences by id from a FASTA file, e.g. just certain chromosomes from a genome. The alternatives are tools that come as part of bigger packages to install (and have no regex support), mostly awk-based awkward (sorry for the pun) bash solutions, and scripts that need additional packages and still lack support for regular expressions. This, however, is a simple, straightforward little Python script for a simple task. It doesn’t do anything else and doesn’t need anything but a stock Python installation. Based on the FASTA reader snippet …
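The general idea can be sketched as follows. This is not the script from the post, only a minimal illustration using the standard library; the function names `read_fasta` and `filter_fasta` and the rule that the id is the first whitespace-separated word of the header are assumptions made for this example.

```python
import re


def read_fasta(lines):
    """Yield (header, sequence) pairs; the header excludes the leading '>'."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_fasta(lines, pattern):
    """Yield only the records whose sequence id matches the regex."""
    regex = re.compile(pattern)
    for header, sequence in read_fasta(lines):
        words = header.split(None, 1)
        seq_id = words[0] if words else ""  # assumed: id is the first word
        if regex.search(seq_id):
            yield header, sequence


# Invented example: keep numbered chromosomes, drop the mitochondrial one.
example = [">chr1 test\n", "ACGT\n", "GGCC\n",
           ">chrM mito\n", "TTTT\n",
           ">chr10\n", "AAAA\n"]
kept = list(filter_fasta(example, r"^chr\d+$"))
for header, sequence in kept:
    print(f">{header}\n{sequence}")
```

For a real file, an open file handle can be passed in place of the `example` list, since both are iterated line by line.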

A script for qPCR analysis in R

An R script for those who like to be close to their qPCR data and catch problems early. It takes export files (multicomponent data, text format, “across columns”) of Life Technologies StepOne machines.

A standard analysis can be done in less than 5 minutes. It consists of these steps:
– plotting of the raw signal (and saving of the result) to catch odd amplification and strong offsets
– baseline correction
– magnified plotting (and save) to check correction and drift in signal
– cycle estimation by using a threshold
– tabulation of the results
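The two numerical steps, baseline correction and cycle estimation by threshold, could look roughly like this. The sketch is in Python rather than R, is not the script described here, and rests on assumptions of its own: the baseline is taken as the mean of a few early cycles, and the threshold cycle is found by linear interpolation between the two cycles that bracket the threshold. The signal values are invented.

```python
def baseline_correct(signal, baseline_cycles):
    """Subtract the mean signal of the given (0-based) early cycles."""
    baseline = [signal[i] for i in baseline_cycles]
    offset = sum(baseline) / len(baseline)
    return [value - offset for value in signal]


def threshold_cycle(signal, threshold):
    """Fractional cycle (1-based) at which the signal first crosses the
    threshold, by linear interpolation; None if it never does."""
    for i in range(1, len(signal)):
        before, after = signal[i - 1], signal[i]
        if before < threshold <= after:
            # signal[i - 1] belongs to cycle i, signal[i] to cycle i + 1
            return i + (threshold - before) / (after - before)
    return None


# Invented raw fluorescence for ten cycles of one well.
raw = [5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 6.0, 9.0, 15.0, 25.0]
corrected = baseline_correct(raw, baseline_cycles=range(0, 5))
ct = threshold_cycle(corrected, threshold=2.0)
print("corrected:", corrected)
print("threshold cycle:", ct)
```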

If Nature had done the maths properly …

A Nature report about a new, enzymatic assay of Mycobacterium tuberculosis is (inadvertently) mostly a stark reminder that the false positive and the false negative rates are both important for evaluating an assay’s performance. Superficially, the assay no doubt has advantages: it does not require PCR, prolonged bacterial culture or microscopy, and, unlike current standard methods, it delivers a result in half an hour. It is also sensitive: in a test it flagged as positive all the samples that microscopy had found. Microscopy missed 50% of all positive samples, but even of those the new method flagged 80% as positive. Overall the assay recognises 90% of the Tb+ cases as such.

Despite these advantages this is not yet a promising method. There is a 27 % false positive rate, i.e. the assay flags about a quarter of the tested patients who have no Tb-causing bacteria in their samples as Tb positive. This is a problem because only 2-400 per 100 000 people get tuberculosis in any country of the world (World Bank). Even at 400 cases per 100 000, the new test flags about 27 000 of the remaining 99 600 healthy persons as positive.
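What this means for a single positive result can be worked out from the figures above. The sketch below assumes a screening population at the upper end of the quoted incidence (400 per 100 000) and treats the 90 % detection rate and the 27 % false positive rate as exact; the function name is chosen for this example.

```python
def positive_predictive_value(sensitivity, false_positive_rate, prevalence):
    """Fraction of positive test results that are true positives."""
    true_positives = sensitivity * prevalence
    false_positives = false_positive_rate * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


population = 100_000
cases = 400                      # upper end of the quoted range
healthy = population - cases     # 99 600

detected = 0.90 * cases          # true positives
false_alarms = 0.27 * healthy    # healthy persons flagged positive

ppv = positive_predictive_value(0.90, 0.27, cases / population)

print(f"true positives: {detected:.0f}")
print(f"false positives: {false_alarms:.0f}")
print(f"chance that a flagged person has Tb: {ppv:.1%}")
```

Under these assumptions only about one in 76 positive results would belong to a person who actually has tuberculosis.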

Poetry in programming: a Quine, suitable for birthdays

A program that emits its own Python source code when run, i.e. a Quine. The print statement means it needs Python 2.

a="a= ;print a[0:2]+chr(34)+a+chr(34)+a[3:]#Another happy return!";print a[0:2]+chr(34)+a+chr(34)+a[3:]#Another happy return!

A version in particular for biologists with added emphasis on the cycle of life:

a="a= ;print a[0:2]+chr(34)+a+chr(34)+a[3:]#Keep studying the miracle of life!";print a[0:2]+chr(34)+a+chr(34)+a[3:]#Keep studying the miracle of life!
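Under Python 3 `print` is a function, so the one-liners above raise a syntax error there. A possible port only adds the parentheses. The sketch below holds that port in a string, runs it, and compares the output with the source to confirm that it still reproduces itself; the helper `run_and_capture` is made up for this check and is not part of the Quine.

```python
import contextlib
import io

# Python 3 port of the birthday Quine (assumption: only print(...) changes).
QUINE = ('a="a= ;print(a[0:2]+chr(34)+a+chr(34)+a[3:])#Another happy return!";'
         'print(a[0:2]+chr(34)+a+chr(34)+a[3:])#Another happy return!')


def run_and_capture(source):
    """Execute source code and return everything it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


output = run_and_capture(QUINE)
is_quine = output == QUINE + "\n"  # print appends a newline
print(output, end="")
print("reproduces itself:", is_quine)
```

The trick is unchanged: `a[0:2]` supplies `a=`, `chr(34)` supplies the quotation marks that cannot appear inside the string itself, and `a[3:]` supplies the rest of the line.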



Old microscope + £100 (+ 3D printer) = GFP fluorescent microscope

[News: Someone linked this text from the Open Source Toolkit: Hardware article collection and channel of PLoS. Thanks.]

I built my first fluorescent microscope. A Leitz Labovert became a surprisingly decent fluorescent microscope for DIY after spending about £100 and 3D printing a few custom parts. It is a simplified design with only a barrier and an excitation filter, LED illumination and no dichroic mirror.

The quality is sufficient for observing fluorescent yeast and, in general, fairly faint Drosophila embryos at low-to-midrange magnification.

Drosophila embryonic tracheal system, GFP-marked. Weak additional bright-field illumination for embryo outline.

Yeast with nuclear GFP marker.

Fluorescence is short wavelength light in, longer wavelength light out. In principle a fluorescent microscope needs only two parts: 1) a light source that emits light at the excitation, but not at the emission wavelengths of the sample, and 2) a barrier filter, which lets only the emitted longer wavelength light through, so that it is not drowned out by the excitation light. There are a few details though, and I will start with the barrier filter.

Setting the lower cutoff of a miniprep

It may be a lesser-known fact that the lower cutoff of PCR cleanup and other DNA minipreps can be dialed in by appropriately diluting the binding buffer with water. PCR cleanup kits usually don’t bind fragments below 50-100 bp, depending on the manufacturer. Diluting the chaotropic salts in the binding buffer with water raises this limit. DNA smaller than the new limit runs through the column, while higher MW DNA adsorbs as before. This can sometimes save the effort of gel purification.