International Journal of Molecular Epidemiology and Genetics

Int J Mol Epidemiol Genet 2012;3(4):262-275

Original Article
Next generation sequencing of CLU, PICALM and CR1: pitfalls and potential
solutions

Jenny Lord, James Turton, Christopher Medway, Hui Shi, Kristelle Brown, James Lowe, David Mann,
Stuart Pickering-Brown, Noor Kalsheker, Peter Passmore*, Kevin Morgan*

Human Genetics, School of Molecular Medical Sciences, Queens Medical Centre, University of Nottingham,
Nottingham, UK; Neuropathology, School of Molecular Medical Sciences, Queens Medical Centre, University of
Nottingham, Nottingham, UK; Clinical Neuroscience Research Group, Greater Manchester Neurosciences Centre,
University of Manchester, Salford, UK; Centre for Public Health, School of Medicine, Dentistry, and Biomedical
Sciences, Queen’s University Belfast, Belfast, Northern Ireland, UK. *The Alzheimer’s Research UK Consortium:
Peter Passmore, Bernadette McGuinness, Janet Johnston, Stephen Todd, Queen’s University Belfast, UK; Reinhard
Heun (now at Royal Derby Hospital), Heike Kölsch, University of Bonn, Germany; Patrick G. Kehoe, University
of Bristol, UK; Nigel M. Hooper, Emma R.L.C. Vardy (now at University of Newcastle), University of Leeds, UK; David
M. Mann, University of Manchester, UK; Kristelle Brown, Noor Kalsheker, Kevin Morgan, University of Nottingham,
UK; A. David Smith, Gordon Wilcock, Donald Warden, University of Oxford (OPTIMA), UK; Clive Holmes, University
of Southampton, UK.

Received October 1, 2012; Accepted October 24, 2012; Epub November 15, 2012; Published November 30, 2012

Abstract: CLU, PICALM and CR1 were identified as genetic risk factors for late-onset Alzheimer’s disease (AD) in
two large genome-wide association studies (GWAS) published in 2009, but the variants that convey this alteration
in disease risk, and how the genes relate to AD pathology, are yet to be discovered. A next generation sequencing
(NGS) project was conducted targeting CLU, CR1 and PICALM, in 96 AD samples (8 pools of 12), in an attempt to
discover rare variants within these AD-associated genes. Inclusion of repetitive regions in the design of the SureSelect
capture led to significant issues in alignment of the data, resulting in poor specificity and a lower than expected
depth of coverage. A strong positive correlation (0.964, p<0.001) was seen between NGS and 1000 Genomes Project
frequency estimates. Of the ~170 “novel” variants detected in the genes, seven SNPs, all of which were present in
multiple sample pools, were selected for validation by Sanger sequencing. Two SNPs were successfully validated by
this method, and shown to be genuine variants, while five failed validation. These spurious SNP calls occurred as
a result of the presence of small indels and mononucleotide repeats, indicating that such features should be regarded
with caution and that validation via an independent method is important for NGS variant calls. (IJMEG1210001).

Keywords: Next generation sequencing, Alzheimer’s disease, genes, CLU, PICALM, CR1

Address all correspondence to:
Dr. Kevin Morgan,
Human Genomics and Molecular Genetics
Institute of Genetics, School of Molecular Medical Sciences
A Floor, West Block, Room 1306, Queens Medical Centre
Nottingham NG7 2UH, United Kingdom.
Tel: 0115 8230724
E-mail: kevin.morgan@nottingham.ac.uk