Sunday, September 28, 2014

Long Lab (summer fun edition)

L->R: Arnold Mak, David Filice, Katie Plagens, Tristan Long, Thiropa Balasubramaniam, Mireille Golemiec, Emily Martin


How to (truly) randomly assign mating treatments - an elegant R approach

This week in the lab we'll be setting up some experiments using our 45 inbred lines of flies. The lines were inbred by mating sets of single males and females derived from the IV line, and subsequently selecting a single brother and sister from the resulting offspring to found the next generation. This process was repeated for >10 generations.

In the experiment we want to randomly pair males and females from different lines, which in R seems pretty simple, as you can use the code

inbred.lines <- 1:45
random.mates <- sample(inbred.lines, 45, replace = FALSE)
combos <- cbind(inbred.lines, random.mates)

This pairs males and females at random, and most of the pairs will come from different lines....
...However, because the sampling is random, R may also pair a male and a female from the same line. This is not a rare event: the chance that a random permutation contains no matches at all is close to 1/e (about 37%), so roughly 63% of runs will include at least one same-line pair. This may not seem a big issue, as you could use a simple logical test such as

inbred.lines==random.mates

to check for matches, and re-run the random sampling whenever the last command returns any TRUE values, until all pairs are different.
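To get a feel for how often a re-run would be needed, the check above can be repeated many times in a small simulation. This is only a rough sketch and not part of the lab workflow; the seed, the number of replicates, and the names n.reps and has.match are arbitrary choices made for illustration.

```r
set.seed(2014)                      # arbitrary seed, only for reproducibility
inbred.lines <- 1:45
n.reps <- 10000                     # arbitrary number of simulated assignments

# For each replicate, draw a random assignment and record whether
# any male was paired with a female from his own line
has.match <- replicate(n.reps, any(sample(inbred.lines) == inbred.lines))

# Proportion of simulated assignments that contain at least one match
mean(has.match)
```

The value printed at the end is the share of random assignments that fail the check and would have to be redrawn.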

This "brute force" approach is OK, I guess, but becomes much less efficient if we want to place two (or more) males, each selected from a different line, in with each female. Now we would get a TRUE value if there was a match between the female and "male 1", between the female and "male 2", or between "male 2" and "male 1". You can imagine how this problem gets more difficult as the number of pairwise comparisons increases with the number of males and females in each vial (see Handshake Problem).
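The growth in the number of checks can be made concrete with choose(), which counts the unordered pairs among the flies in a vial. A small sketch, assuming one female per vial; the names flies.per.vial and comparisons are made up for this example.

```r
# Number of pairwise comparisons needed per vial when every fly
# must come from a different line (the "handshake" count)
flies.per.vial <- 2:5               # 1 female plus 1 to 4 males
comparisons <- choose(flies.per.vial, 2)
names(comparisons) <- flies.per.vial
comparisons
```

So a vial with one female and two males needs 3 checks, and adding a third male brings that to 6.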

So, as I could not find a solution online, I have developed some R code to quickly and elegantly solve this problem.
Let's begin as above

inbred.lines <- 1:45

Here is my solution

repeat {
  treatmentBB <- sample(inbred.lines, 45)               # random order, no replacement
  if (!any(treatmentBB == inbred.lines)) break          # keep only if no row matches
}

As you can see in this code, I am creating a new vector "treatmentBB" that I populate with the 45 line numbers in random order (sampled without replacement from the inbred.lines vector). The next step is to ask R to check if any of the rows match. If they do, then we ask R to start all over again, but if there are no matches, then to leave treatmentBB alone.

Now let's expand this by adding a second (random) male to each treatment, creating a vector of values called treatmentE1. As you can see below, I have taken into account potential matches between females and males from treatmentE1, and between males from treatmentE1 and males from treatmentBB (matches between females and males from treatmentBB were already ruled out above).

repeat {
  treatmentE1 <- sample(inbred.lines, 45)
  # redraw if any male matches the female or the treatmentBB male in his row
  if (!any(treatmentE1 == inbred.lines | treatmentE1 == treatmentBB)) break
}

And if I wanted to add a third, I would need to make sure that all possible matches are accounted for...

repeat {
  treatmentE2 <- sample(inbred.lines, 45)
  # redraw if any male matches the female, the treatmentBB male or the treatmentE1 male
  if (!any(treatmentE2 == inbred.lines | treatmentE2 == treatmentBB | treatmentE2 == treatmentE1)) break
}
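The same redraw-until-clean idea can be wrapped in a function so that the list of comparisons does not have to be written out by hand for each extra male. What follows is a minimal sketch, assuming every vial gets one female and the same number of males; assign.mates, its arguments, and the column names are hypothetical names invented for this example.

```r
# Hypothetical helper: builds one column per male, redrawing each
# column until no row repeats a line already present in that vial
assign.mates <- function(lines, n.males) {
  vials <- matrix(lines, ncol = 1, dimnames = list(NULL, "female"))
  for (m in seq_len(n.males)) {
    repeat {
      candidate <- sample(lines)    # random order, no replacement
      # clash[i] is TRUE if candidate[i] equals any fly already in vial i
      clash <- rowSums(vials == candidate) > 0
      if (!any(clash)) break
    }
    vials <- cbind(vials, candidate)
    colnames(vials)[ncol(vials)] <- paste0("male", m)
  }
  vials
}

set.seed(45)                        # arbitrary seed, for reproducibility
vials <- assign.mates(1:45, n.males = 3)
head(vials)
```

Each column uses every line exactly once, and each row (vial) holds flies from different lines only. Because every new column has to avoid all the earlier ones, the number of redraws grows as males are added, and the loop cannot finish if a vial is asked to hold more flies than there are lines.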

Hope you find this useful!
TL