Explore the state of the UK, October 2017

Posted on November 10, 2017 by dm

An earlier post explored the winning and losing parts of London, as measured by the success of different kinds of cheap and expensive food-selling enterprises.

Assayed the same way (with the same shortcomings, too!), how did the rest of the UK do? A first answer is: outside London, not many places have done very well, but the Sheffield area is a clear winner. Less densely populated areas are the ones losing most.

Everyone has different questions though: are cheap or expensive venues becoming more successful where I (want to) live? What kind of shops are opening in the South-West? Which parts of the country is Pizza Chain X focusing on? In the linked interactive map I created you can look for yourself. The map answers questions both about the whole of the UK and about your favourite counties, cities or boroughs (just zoom in!).

As a practical note, sometimes a far-out region appears surprisingly full of activity. It is worth double-checking this. It may be because the local authority dumped or purged a lot of businesses at the same time (usually in smaller places). It helps to pan the map so that this area is out of view; the heat map then returns to a more useful scale for the rest.

The list below the map shows the businesses coming and going in the area you are looking at. In the left pane you can filter by business name or type, e.g. if you were wondering what supermarkets in general, or Tesco specifically, are doing in a particular area.

Feedback is most welcome below this post. For the future I am considering adding comparisons between different time periods to make this tool even more useful. Please be kind to the tiny server if it is slow. My thanks to the makers of R, Shiny, ggmap and leaflet, and to EC2 for the server.

New and closing restaurants and sandwich shops in the UK.

Combining probability distributions

Posted on October 26, 2017 by dm

This table is not yet finished.

If I observe the sum of two processes with known distributions, the distribution of the observations is expected to be …

+	Normal	Poisson	Binomial	Uniform
Normal	Normal	See here
Poisson		Poisson, normal if many summands
Binomial			Binomial, if common p, Poisson, if many summands, Poisson binomial otherwise, note also the binomial sum variance inequality and this.
Uniform				Irwin-Hall

Wikipedia has some more. A general discussion of probability distribution convolutions is for example here.
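As a small numerical check of one table entry (Poisson + Poisson = Poisson), the sketch below convolves two Poisson probability mass functions and compares the result with a Poisson distribution whose mean is the sum of the two means. The means 2.0 and 3.5 and the cut-off at 40 are arbitrary illustrative choices.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def convolve(p, q):
    """PMF of X + Y for independent X and Y, given as lists indexed by value."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

lam1, lam2 = 2.0, 3.5   # illustrative means, not taken from the post
kmax = 40               # compare the PMFs for values 0..kmax

p = [poisson_pmf(k, lam1) for k in range(kmax + 1)]
q = [poisson_pmf(k, lam2) for k in range(kmax + 1)]

summed = convolve(p, q)                                        # PMF of the sum
direct = [poisson_pmf(k, lam1 + lam2) for k in range(kmax + 1)]

# zip() stops at kmax, where the truncated convolution is still exact
max_diff = max(abs(a - b) for a, b in zip(summed, direct))
print(max_diff)
```

The same `convolve` helper works for any two discrete distributions given as lists, which is one way to fill in other cells of the table numerically.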

Feature detectors in animal vision

Posted on October 12, 2017 by dm

Image feature detectors are a common concept between mammalian vision and computer vision. When using them, a raster image is not directly processed to identify complex objects (e.g. a flower, or the digit 2). Instead feature detectors map the distribution of simple figures (such as straight edges) within the image. Higher layers of the neural network then use these maps for distinguishing objects.

In the mammalian brain’s visual cortex (which is at the back of the head, at the furthest possible point from the eyes) the image on the retina is recreated as a spatially faithful projection of the excitation pattern on the retina. Overlapping sets of feature detectors use this as input.

From eyeball to visual cortex in humans. Note the Ray-Ban-shaped area at the back of the brain onto which the retinal excitation pattern is projected, with some distortions. (From Frisby: Seeing: the computational approach to biological vision (2010), p 5)

How we know about retinotopic projection to the visual cortex: an autoradiograph of a macaque brain slice shows in dark the neurons that were most active as a result of the animal seeing the image at the top left. (From Tootell et al., Science (1982) 218, 902-904.)

A feature detector neuron becomes active when its favourite pattern shows up in the projected visual field – or more exactly in the area within the visual field where each detector is looking. A typical class of detectors is specific for edges with a specific angle, where one side is dark, and the other side is light. Other neurons recognise more complex patterns, and some also require motion for activation. These detectors together cover the entire visual field, and their excitation pattern is the input to higher layers of processing. We learned about these neurons first by sticking microelectrodes into the visual cortex and measuring electrical activity. When lucky, the electrode measured the activity of a single neuron; then by showing different visual stimuli the activation pattern of the neuron could be mapped.
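The edge detectors described above have a direct counterpart in computer vision. The following is a minimal sketch, not a model of real neurons: a small kernel is slid over a toy image, and the response is large only where a dark-to-light vertical edge sits under the kernel. The image, the kernel values and the function name `respond` are illustrative assumptions.

```python
def respond(image, kernel):
    """Slide the kernel over the image (valid positions only); return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    rows = len(image) - kh + 1
    cols = len(image[0]) - kw + 1
    return [[sum(kernel[a][b] * image[r + a][c + b]
                 for a in range(kh) for b in range(kw))
             for c in range(cols)]
            for r in range(rows)]

# Toy "projected" image: dark (0) on the left, light (1) on the right
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]

# Detector for a vertical edge with the dark side on the left
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

# Detector for a horizontal edge with the dark side on top
horizontal_edge = [[-1, -1, -1],
                   [ 0,  0,  0],
                   [ 1,  1,  1]]

v_map = respond(image, vertical_edge)
h_map = respond(image, horizontal_edge)

for row in v_map:
    print(row)
```

Each entry of `v_map` plays the role of one detector looking at its own patch of the visual field. The horizontal detector stays silent on this image because there is no horizontal edge in it.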

A toad’s antiworm detector neuron reacts to a stripe moving across its receptive field. The stripe may travel in any direction, but the neuron reacts only when the stripe moves crosswise, i.e. perpendicular to its own long axis. The worm detector, for comparison, reacts when the stripe moves lengthwise. The toad is on the right with the microelectrode; the oscillogram above the screen shows the activity of the tapped feature detector neuron. (Segment from Jörg-Peter Ewert, Gestalt Perception in the Common Toad – 3. Neuroethological Analysis of Prey Recognition.)

Continue reading →

The state of London, October 2017

Posted on October 8, 2017 by dm

Brixton, and from there a corridor towards the Thames ending at Vauxhall and Elephant and Castle, are winning. The areas around Islington, Hackney and Greenwich are struggling. Soho and the rest of Westminster keep doing well. This, at least, is what starting and failing food-related businesses tell us about the last six months in London. I felt food is something everyone buys daily, and whether it is cheap or expensive, less or more, is a good indication of socioeconomic developments.

Increase and decrease of food-related businesses in London over the six months up to October 2017. (Map backgrounds are courtesy of Google Maps. Overlays: R, ggmap.)

Continue reading →

A quick way to fit an origin line to a Poisson point cloud

Posted on September 25, 2017 by dm

Just as a quick note: sometimes there is a quicker way to estimate the parameter of a Poisson model from data than fitting a generalised linear model (e.g. via R’s glm function). This is the case when the expected mean λ is just a straight line that starts at 0 at time 0: $$\lambda(t) = gt.$$ This can model, for example, the number n(t) of a steadily produced mRNA species in a cell after the enhancer becomes active for the first time: the sum of two Poisson-distributed values with means λ₁ (number already existing) and λ₂ (production during the next time slice) is also Poisson-distributed, with mean $$\lambda_1+\lambda_2.$$

In this case the maximum likelihood or Bayesian estimate (they are the same, assuming no particular prior knowledge) for g is simply

$$ g=\frac{\sum{n}}{\sum{t}} $$

This is because the probability of a single Poisson event with λ=gt is Continue reading →
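A minimal sketch of the estimate g = Σn / Σt from above, with a numerical check that it does sit at the maximum of the Poisson log-likelihood. The time points and counts are made-up illustrative data, and the function names are hypothetical.

```python
import math

def fit_origin_line(times, counts):
    """Slope g of the mean lambda(t) = g * t, estimated as sum(n) / sum(t)."""
    return sum(counts) / sum(times)

def log_likelihood(g, times, counts):
    """Poisson log-likelihood of slope g for counts n_i with means g * t_i."""
    return sum(n * math.log(g * t) - g * t - math.lgamma(n + 1)
               for t, n in zip(times, counts))

# Made-up observations: counts n at times t (all t > 0)
times = [1.0, 2.0, 3.0, 4.0, 5.0]
counts = [3, 4, 8, 9, 14]

g_hat = fit_origin_line(times, counts)
print(g_hat)

# The log-likelihood should drop when moving away from g_hat in either direction
ll_hat = log_likelihood(g_hat, times, counts)
ll_low = log_likelihood(0.95 * g_hat, times, counts)
ll_high = log_likelihood(1.05 * g_hat, times, counts)
print(ll_low < ll_hat > ll_high)
```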

Caveat emptor with iOS / HealthKit step data

Posted on September 5, 2017 by dm

Step counts were not recorded in the same way before and after September 2016 on the iPhone, which leads to some artefacts. This slight complication might be interesting to those who intend to analyse long periods of health data.

The change came with an iOS update. Helpfully, data points exported from the Health app contain the iOS version current at the time, for records made after iOS 9. Perhaps you can spot the difference pre- and post-iOS 10 below. The plot shows steps/second over the years from the same device. Each dot was calculated from one record.
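A rough sketch of how one steps/second value per record could be computed from a Health export. It is not the code behind the plot. It assumes that step records are `Record` elements of type `HKQuantityTypeIdentifierStepCount` with `startDate`, `endDate`, `value` and `sourceVersion` attributes, and that dates look like `2016-09-01 08:00:00 +0100`. The two inline records are invented sample data.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Invented sample in the assumed layout of a Health export file
SAMPLE = """<HealthData>
  <Record type="HKQuantityTypeIdentifierStepCount" sourceVersion="9.3.5"
          startDate="2016-09-01 08:00:00 +0100"
          endDate="2016-09-01 08:00:10 +0100" value="15"/>
  <Record type="HKQuantityTypeIdentifierStepCount" sourceVersion="10.0.1"
          startDate="2016-09-20 08:00:00 +0100"
          endDate="2016-09-20 08:05:00 +0100" value="420"/>
</HealthData>"""

DATE_FORMAT = "%Y-%m-%d %H:%M:%S %z"   # assumed date layout

def steps_per_second(xml_text):
    """Return (start time, iOS version, steps/second) for each step record."""
    rows = []
    for rec in ET.fromstring(xml_text).iter("Record"):
        if rec.get("type") != "HKQuantityTypeIdentifierStepCount":
            continue
        start = datetime.strptime(rec.get("startDate"), DATE_FORMAT)
        end = datetime.strptime(rec.get("endDate"), DATE_FORMAT)
        seconds = (end - start).total_seconds()
        if seconds > 0:   # skip zero-length records to avoid dividing by zero
            rows.append((start, rec.get("sourceVersion"),
                         float(rec.get("value")) / seconds))
    return rows

rows = steps_per_second(SAMPLE)
for start, version, rate in rows:
    print(start.date(), version, round(rate, 2))
```

Grouping the rows by the major version in `sourceVersion` is then enough to compare record lengths and rates before and after the update.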

Magnification around the update shows that there are fewer data points post iOS 10:

Continue reading →

Hidden messages

Posted on November 25, 2015 by dm

Mapp and Lucia aficionados are clearly at an advantage here. This was part of a birthday treasure hunt for someone special: a poster hanging at the Barbican Library, and a transparency in a book at the Guildhall Gallery that, when overlapped, gave a cue.

The poster. Sesquiannual meetings!

The transparency. The parallel bars are a different puzzle.

A small academic cottage industry has improved methods for hiding information so that it is only revealed when innocuous-looking pictures are overlapped. Properly done, visual cryptography can offer the strength of a one-time pad and still be decoded by merely overlapping a pre-defined number of transparencies (or “shares”) with seemingly random or unrelated patterns.

No such pretensions here though, and a very simple method was used: I represented each grayscale pixel in the patent (i.e. obvious) images with a 2×2 black-and-white matrix. 0 – 1/4 – 1/2 – 3/4 – 4/4 of the matrix was black, depending on the darkness of the original pixel. When dithering an image this way, for middle tones there is a choice of different patterns with equal darkness. For example, there are six ways to set half the matrix black: ▚, ▞, ▌, ▐, ▀, ▄. Depending on the pattern combination, two overlapping 1/2 black matrices can be 1/2, 3/4, or 4/4 black:

▚ + ▚ = ▚ or ▀ + ▀ = ▀

▚ + ▌ = ▙ or ▀ + ▐ = ▜

▚ + ▞ = █ or ▀ + ▄ = █

and so on.
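The overlap rule can be sketched in a few lines: a cell of the stacked matrix is black if it is black on either transparency. The tuple encoding (upper-left, upper-right, lower-left, lower-right) is an arbitrary choice made for this illustration.

```python
from itertools import combinations

# Cell order: (upper-left, upper-right, lower-left, lower-right); 1 = black
HALF_BLACK = [tuple(1 if i in on else 0 for i in range(4))
              for on in combinations(range(4), 2)]   # all half-black patterns

def overlay(a, b):
    """Stack two transparencies: a cell is black if it is black on either one."""
    return tuple(x | y for x, y in zip(a, b))

def darkness(m):
    """Number of black cells out of four."""
    return sum(m)

diag = (1, 0, 0, 1)   # ▚
anti = (0, 1, 1, 0)   # ▞
left = (1, 0, 1, 0)   # ▌

print(len(HALF_BLACK))                  # number of half-black patterns
print(darkness(overlay(diag, diag)))    # ▚ + ▚
print(darkness(overlay(diag, left)))    # ▚ + ▌
print(darkness(overlay(diag, anti)))    # ▚ + ▞

# All darkness levels reachable by overlapping two half-black matrices
levels = sorted({darkness(overlay(a, b)) for a in HALF_BLACK for b in HALF_BLACK})
print(levels)
```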

Generally a 2×2 matrix in the hidden image can represent any grey that is at least as dark as the darker of the two overlapping matrices. Continue reading →

Our daily bread, equivalent of 11-37 kg of batteries

Posted on August 20, 2015 by dm

A back of the envelope calculation on how many kilograms of lithium-ion batteries we would have to carry around to power us for a day — until the next nightly recharge.

Depending on age and sex, about 8-13 MJ of energy is needed for a day’s existence, assuming light work. Lithium-ion batteries are widespread in smartphones, electric cars and other electronics, not least because of their relatively large specific energy of 0.3-0.7 MJ/kg.

Given this, our battery pack would weigh between 11 and 37 kg for once-a-day recharge.

Having calculated this, how about body fat, our own kind of storage medium? Population average body fat content is around 20 %, and about 10 % body fat is essential. Assuming a body weight of about 80 kg, this leaves about 8 kg storage fat per person or 8 kg * 39.5 MJ/kg = 316 MJ stored energy.

Even in the best case, that is 451 kg of batteries to carry.
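The arithmetic can be retraced in a few lines, using only the figures quoted in the text. The variable names are illustrative.

```python
# Figures as quoted in the text
daily_need_mj = (8.0, 13.0)        # MJ per day, low and high
specific_energy = (0.3, 0.7)       # MJ per kg of lithium-ion battery, low and high

# Lightest pack: smallest need with the best battery; heaviest: the reverse
lightest_kg = daily_need_mj[0] / specific_energy[1]
heaviest_kg = daily_need_mj[1] / specific_energy[0]
print(round(lightest_kg, 1), round(heaviest_kg, 1))

# Storage fat: 20 % average body fat minus 10 % essential fat, at 80 kg body weight
body_weight_kg = 80.0
storage_fat_kg = body_weight_kg * (20 - 10) / 100
fat_energy_mj = storage_fat_kg * 39.5          # 39.5 MJ per kg of fat
print(fat_energy_mj)

# Battery mass holding the same energy, using the best specific energy
battery_for_fat_kg = fat_energy_mj / specific_energy[1]
print(round(battery_for_fat_kg))
```

With the figures exactly as quoted, the heaviest pack comes out at about 43 kg. The 37 kg upper end would follow from a lower specific energy bound of about 0.35 MJ/kg rather than 0.3 MJ/kg.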

Newton stood on the shoulders of 14 m tall giants (or less)

Posted on February 27, 2015 by dm

If I have seen further it is by standing on the shoulders of giants.

(Newton in a letter to Robert Hooke)

From this we can calculate the giants to be at most 14.3 m tall, assuming they are human-shaped. This is because, to see further, Newton’s eyes must be higher than the eyes of the giant, i.e. his eye height, standing, must be larger than the shoulder-eye distance of the giants. Sir Isaac is reported to have been five feet six inches (UK), which is about 167.6 cm. Using present-day median values for eye height and shoulder height (see below) for approximate proportions, his eyes were at about 156.0 cm. Taking this as the upper limit for the giant’s shoulder-eye distance, it follows by proportion that the giant’s height is at most about 9.198 times Newton’s eye height, that is about 14.3 metres, or roughly 8.6 times Newton’s own height.

Finally, it is worth considering that as giants are taller they are probably also proportionately wider and thicker than Newton, so that at most they are about 626 times heavier than him. When body sizes are scaled up proportionately, the weight scales with the cube of the height, but the bone cross-section area, which determines the maximal load, only with the square. If giants are subject to biological limits of bone strength, then their bones have at worst only about 12 % of the relative strength of Newton’s. Thus such giants can probably best bear their body weight (and Newton’s) when standing neck-deep under water. That however would defeat the purpose.

Detailed calculation:

AVERAGE HUMAN (50th percentiles, in cm)
eye height = 163.26
shoulder height = 144.18
shoulder-eye-distance = 19.08
total height = 175.49
EH:TH = 0.93031
SE:TH = 0.10872414

NEWTON (see here and more here)
with some likelihood five feet six inches = 167.64 cm, then eye height by proportion = 155.9572

GIANT
shoulder-eye-distance < 155.9572 then total height by proportion <1434.43.
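The detailed calculation can be retraced as follows, using only the figures listed above. The sketch also derives two ratios that the listing does not state explicitly: the giant’s height relative to Newton’s eye height and relative to Newton’s total height. The variable names are illustrative.

```python
# Average human, 50th percentiles, in cm (figures from the listing above)
eye_height = 163.26
shoulder_height = 144.18
total_height = 175.49

shoulder_eye = eye_height - shoulder_height    # shoulder-eye distance
eh_to_th = eye_height / total_height           # EH:TH
se_to_th = shoulder_eye / total_height         # SE:TH

# Newton: five feet six inches
newton_height = 167.64
newton_eye = newton_height * eh_to_th

# Giant: shoulder-eye distance must stay below Newton's eye height
giant_max_height = newton_eye / se_to_th

ratio_to_eye = giant_max_height / newton_eye        # equals 1 / SE:TH
ratio_to_height = giant_max_height / newton_height  # giant vs Newton, total heights
weight_factor = ratio_to_height ** 3                # weight scales with the cube

print(round(newton_eye, 2), round(giant_max_height, 2))
print(round(ratio_to_eye, 3), round(ratio_to_height, 3), round(weight_factor))
```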

Footnote: The giants and shoulders metaphor has been used at least since scholasticism.

Filter FASTA files by sequence id using a regular expression

Posted on December 12, 2014 by dm

Use a regular expression to filter sequences by id from a FASTA file, e.g. to keep just certain chromosomes from a genome. The alternatives are tools that are part of bigger packages one has to install (and that lack regex support), mostly awk-based, awkward (sorry for the pun) bash solutions, and scripts that depend on packages one needs to install and that still do not support regular expressions. This, however, is a simple, straightforward little Python script for a simple task. It doesn’t do anything else and doesn’t need anything but a stock Python installation. Based on the FASTA reader snippet. Continue reading →
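The idea can be sketched with the standard library alone. This is an illustrative reimplementation, not the script from the post: the function names are hypothetical, the id is assumed to be the first whitespace-delimited token of the header line, and the sample records are invented.

```python
import re

def read_fasta(lines):
    """Yield (header, sequence_lines) records from an iterable of lines."""
    header, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, seq
            header, seq = line[1:], []
        elif header is not None and line:
            seq.append(line)
    if header is not None:
        yield header, seq

def filter_fasta(lines, pattern):
    """Yield the lines of all records whose sequence id matches the regex."""
    regex = re.compile(pattern)
    for header, seq in read_fasta(lines):
        parts = header.split()
        seq_id = parts[0] if parts else ""   # assumed: id = first token of header
        if regex.search(seq_id):
            yield ">" + header
            yield from seq

# Invented sample: keep only the numbered chromosomes
sample = [">chr1 test", "ACGT", ">chr2", "GGCC", ">chrUn_random", "TTAA"]
kept = list(filter_fasta(sample, r"^chr\d+$"))
print("\n".join(kept))
```

Since `filter_fasta` takes any iterable of lines, an open file handle can be passed in place of the sample list.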

David Molnar [Update:, PhD]

Interesting things.

Author Archives: dm
