Alkes Price
The talk is "De Novo Identification of Repeat Families in Large Genomes", presented by Alkes Price. The slides are available here. A repeat family is a collection of similar sequences that appear many times in the genome, e.g. Alu repeats. When the copies are already known, the recipe is simple: pull out the Alu sequences, align them, take the consensus. De novo, though, we don't know the regions, we don't know the boundaries, and repeats often appear as partial rather than full copies. Eddy concludes that the problem is messy.
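To make the easy case concrete (copies already pulled out and aligned), here is a minimal sketch of the consensus step. It is not from the talk: the function name `consensus`, the gap character, and the short toy strings are all assumptions for illustration, and the alignment itself is taken as given. It does nothing for the hard de novo part, where regions and boundaries are unknown.

```python
from collections import Counter

def consensus(aligned, gap="-"):
    """Column-wise majority consensus of equal-length, pre-aligned sequences.

    Hypothetical helper for illustration; gaps are ignored when voting and
    columns that contain only gaps are dropped from the result.
    """
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    out = []
    for column in zip(*aligned):
        # Count the non-gap bases seen in this alignment column.
        counts = Counter(base for base in column if base != gap)
        if counts:
            out.append(counts.most_common(1)[0][0])
    return "".join(out)

# Toy aligned copies (illustrative strings, not real data).
copies = [
    "GGCCGGGCGC",
    "GGCTGGGCGC",
    "GGCCGGG-GC",
    "GGCCGAGCGC",
]
print(consensus(copies))  # prints GGCCGGGCGC
```

Each copy differs from the others at one position, but the majority vote per column recovers a single shared sequence, which is the sense in which a family has a consensus.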
Why do this? Repeats are biologically meaningful: genome rearrangements, drivers of evolution, etc. For pragmatic reasons we also need repeat masking. Why? To do comparative genomics. You need to mask repeats before alignment, and RepeatMasker is effective only if you already have a library of repeats. So how do you identify the repeat families in large genomes?

