3.0 CNV

The Copy Number Variant (CNV) module of SFARI Gene is a comprehensive, up-to-date collection of all known CNVs that have been associated with ASD. CNVs are segments of DNA, typically greater than 1,000 base pairs in length, that vary in number from person to person. These submicroscopic deletions and duplications are increasingly thought to be involved in the pathogenesis of a wide range of human diseases, including neuropsychiatric disorders such as ASD. For this reason, the CNV module of SFARI Gene serves as a valuable resource for the ASD research community.

3.1 Data Overview and Navigation

The data contained in the CNV module is curated from the latest peer-reviewed research. MindSpec researchers systematically search, collect, and extract information on CNVs from autism case cohorts and, when available, unaffected control cohorts. CNVs in the module are organized based on the locus (chromosomal region or band) in which they were observed in each annotated report.

All of the CNVs identified in the SFARI Gene database can be viewed using the new Copy Number Variant (CNV) Scrubber tool. The table below the CNV Scrubber contains the name of every CNV in the database, along with several columns of the most pertinent information. The dynamic Quick Search tool at the top of the table can be used to immediately filter the table’s contents to help find specific information about a certain CNV.

Data from the module is also accessible via the Human Gene module. If a CNV has been observed in a particular gene, it is listed in the table on that gene’s summary page under the Associated CNVs column.

3.2 Data Visualization

Copy Number Variant (CNV) Scrubber
The CNV Scrubber provides users with a quantitative look at copy number variants that occur in every chromosome. The scrubber shows the number of CNVs found at a particular locus, the number of reports curated, and whether a CNV is primarily caused by deletion or duplication.

The data on CNVs are presented along two parallel axes. The sliding scrubber on top uses vertical bars to show the number of CNVs at each location in every chromosome along the human genome, while the larger window below uses horizontal bars to illustrate the number of overlapping CNVs at every locus. This dual functionality gives users both a broad overview of the most common genetic locations of CNVs and a more in-depth look at the size and frequency of individual CNVs and whether they are attributed to duplication or deletion.

When users hover over a CNV, they will see a brief summary of the CNV, including its name. When a CNV is selected, users are taken to a CNV summary page where they can find more in-depth information about the CNV, the reports implicating it in ASD, observed instances in related animal models, and more.

Color filters can be applied to the data viewed with the CNV Scrubber to highlight different aspects of each CNV. The data can be filtered by:

Deletion vs. duplication
Copy number variation is caused by either the duplication or the deletion of a DNA segment within a given chromosome. In many cases, however, deletions and duplications can be found at the same locus. The CNV Scrubber uses a color gradient to represent the instances in which a CNV has been attributed to deletion, duplication, or both.

Number of studies
The data available in the CNV Scrubber can be viewed by the number of studies that identify each CNV. This feature will help users determine, at a glance, the relative frequency of ASD-implicated CNVs found in a particular location.

Number of CNVs
Users can also filter their results to highlight the number of individual CNVs found at each location. As the number of CNVs identified in a study can vary greatly, this view gives users an accurate representation of the individual CNVs that have been catalogued in the database, regardless of how frequently or infrequently they’ve been cited in published research.

3.3 New Features in SFARI Gene 3.0

The contents of the CNV module have been updated to eliminate redundancies, improve data visualization, and ensure that the latest information is available to researchers. The improved interface also allows users to see which genes fall within a CNV’s range. The newest features of the CNV module include:

The Copy Number Variant (CNV) Scrubber
For more information, please see the Data Visualization section above.

Single CNV Visualization
Every CNV summary page in the module features a visualization of the chromosomal range in which the CNV is known to occur. This new single CNV visualization tool maps the CNV along the corresponding cytogenetic band, measuring its length and location in base pairs. It shows users the minimum and maximum range in which the CNV has been identified, the median start and end points, and the median length. Buttons at the top of the tool allow users to view either data found in case populations or data found in control populations.
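The quantities shown by the single CNV visualization can be illustrated with a minimal sketch. This is not SFARI Gene code: the function name `summarize_cnv`, the tuple layout, and the coordinates are hypothetical, and length is assumed here to be end minus start.

```python
from statistics import median

def summarize_cnv(intervals):
    """Summarize (start, end) base pair coordinates reported for one CNV.

    `intervals` is a hypothetical list of (start, end) tuples, one per
    observation of the CNV at a locus.
    """
    if not intervals:
        raise ValueError("at least one observation is required")
    starts = [start for start, _ in intervals]
    ends = [end for _, end in intervals]
    # Assumption: length is measured as end - start.
    lengths = [end - start for start, end in intervals]
    return {
        "min_start": min(starts),        # leftmost observed boundary
        "max_end": max(ends),            # rightmost observed boundary
        "median_start": median(starts),
        "median_end": median(ends),
        "median_length": median(lengths),
    }

# Made-up coordinates, for illustration only.
observations = [(100, 200), (150, 300), (120, 220)]
print(summarize_cnv(observations))
```

Running the same summary separately on case and control observations would mirror the case and control buttons described above.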

3.4 Quick Search and Advanced Search

Data from the CNV module can be found by using the full site search in the top navigation or the Quick Search or Advanced Search tools. The Quick Search tool below the CNV Scrubber can be used to immediately filter the content of the table. The table is dynamic, updating the information displayed as the user types. The Advanced Search tool, located on the Tools page of SFARI Gene, searches the entirety of the database, and provides additional filtering options to further refine the information found in the search.

3.5 CNV Entry Summary

Each CNV curated into the module has a dedicated entry summary page that contains the following information:

Type – This section shows whether a CNV is caused by deletion, duplication, or both (deletion-duplication).
Average Length – This section shows the average base pair length of the CNV.
Range – This section denotes the range within the genome in which the CNV has been found.
Associated Human Genes – This section lists human genes contained within each CNV interval.
Associated Animal Models – This section lists the titles of any related animal models in which the CNV occurs.
Number of Autism Reports – This section shows the number of autism-specific reports in which a CNV is implicated.
Rare Variants – This section shows the number of rare variants identified.
Common Variants – This section shows the number of common variants identified.
Summary – This section details the unique characteristics of the CNV.
External Links – This section provides links to major external databases such as Entrez Gene, UniProt, and GeneCards.
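The three values of the Type field listed above can be derived from the set of events observed at a locus. The following is a minimal sketch under that assumption; the function name `cnv_type` and the input format are hypothetical and do not describe how SFARI Gene stores its data.

```python
def cnv_type(events):
    """Return 'deletion', 'duplication', or 'deletion-duplication'.

    `events` is a hypothetical iterable of strings, one per observed CNV
    at the locus, each either 'deletion' or 'duplication'.
    """
    kinds = set(events)
    if not kinds or kinds - {"deletion", "duplication"}:
        raise ValueError("events must be 'deletion' and/or 'duplication'")
    if len(kinds) == 2:
        # Both deletions and duplications were observed at this locus.
        return "deletion-duplication"
    return kinds.pop()

print(cnv_type(["deletion", "deletion"]))
print(cnv_type(["deletion", "duplication", "deletion"]))
```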

In the tabs below, users can find additional information on the individual reports linking a CNV to ASD, a breakdown of which studies observed populations versus those that observed individuals, and associated animal models.

Reports – This section shows the title, author, and year of the studies implicating a CNV in ASD. It also shows whether a report is classified as validated or unvalidated.
Population and Individual Cases – This section shows how many instances of a CNV were found in population studies and how many were found in individual cases.
Associated Animal Models – This section lists any occurrences of the CNV found in related animal models that have been curated into SFARI Gene. It shows the name of the animal model, the total number of reports, the type of model, and the species in which it was found. If users click these animal model results, they will be taken to the corresponding entry in the Animal Models module of SFARI Gene.

3.6 Populations and Individuals

Populations
Each population (or cohort) in the CNV module dataset is assigned a name (or cohort ID) based on information from the initial report. A cohort name consists of the last name of the first author listed in the report, the year in which the report describing the cohort was published, the disease being investigated, whether the cohort is a discovery cohort or a replication cohort, and whether the cohort consists of cases (i.e., individuals diagnosed with the disease of interest) or controls.

While all reports in the database feature a discovery case cohort, only a few also describe a replication case cohort, in which the authors attempt to replicate their findings in the discovery cohort sample with a new population of cases.

For example, for the ASD discovery case cohort described in Pinto D 2010, the name of the cohort in the module would be: pinto_10_ASD_discovery_cases.

For the ASD replication control cohort described in Glessner JT 2009, the name of the cohort would be: glessner_09_ASD_replication_controls.
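The naming scheme can be sketched in a few lines of code. The helper `cohort_id` and its parameter names are hypothetical; the lowercase surname and two-digit year are inferred from the two examples above rather than from a published specification.

```python
def cohort_id(first_author, year, disease, stage, group):
    """Build a cohort name such as 'pinto_10_ASD_discovery_cases'.

    Assumptions (inferred from the examples in this guide): the surname
    is lowercased and the year is abbreviated to two digits.
    """
    if stage not in ("discovery", "replication"):
        raise ValueError("stage must be 'discovery' or 'replication'")
    if group not in ("cases", "controls"):
        raise ValueError("group must be 'cases' or 'controls'")
    return f"{first_author.lower()}_{year % 100:02d}_{disease}_{stage}_{group}"

print(cohort_id("Pinto", 2010, "ASD", "discovery", "cases"))
print(cohort_id("Glessner", 2009, "ASD", "replication", "controls"))
```

The two calls reproduce the cohort names given in the examples above.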

When available, the following information on both case and control cohorts is extracted and presented in the CNV module as population data:

Cohort ID
External link to the report – This linked text also provides the last name of the first author of the report and the year it was published.
Description – A brief synopsis of the cohort, including the source of the individuals within the cohort.
Cohort size – Case and control cohorts come in a wide range of sizes. Case cohorts of smaller sizes frequently provide more information on the phenotypic characteristics of affected individuals within the cohort but are of less significance in statistically determining the pathogenic relevance of a CNV at a given locus across populations. On the other hand, larger case cohorts are more useful in statistically determining pathogenic CNV relevance, but they typically provide far less information on the phenotypic characteristics of affected individuals.
Diagnosis – The diagnostic criteria (ADI-R, ADOS, etc.) are often described, along with the number of individuals with specific primary diagnoses, such as autism, Asperger’s, or PDD-NOS.
Age – Typically the minimum, maximum, and mean age are provided.
Gender – Males are diagnosed with ASD approximately four times more often than females. Reflecting this disparity, males with autism tend to make up roughly 70-85% of most large case cohorts. Control cohorts, on the other hand, are typically 50% male.
CNV size – The size of the largest CNV found at the locus of interest in a given report.
Deletion vs. duplication – The number of CNVs caused by deletions, duplications, or both.
Total number of CNVs
Geographical ancestry – The majority of cohorts are predominantly of Caucasian/European origin. As such, determining the pathogenic relevance of a CNV at a given locus across ethnic groups is difficult.

Users can also expand the cohort details to reveal the following information:

The average start and end points of the CNV
The genome build of the CNV
A brief description of the validation method(s) used
The primary disorder inheritance
Whether the CNV is inherited or arises de novo
The family profile associated with the CNV
A comprehensive list of genes (including genes that have not been curated into the database) that are contained within the CNV range

Individuals
Many published reports in which copy number variants have been identified include information on the individuals (also referred to as “cases” or “probands”) with autism. When provided, this information is curated and presented in the individual data section.

Each entry in the CNV module dataset is assigned a name (or patient ID) that consists of the name of the cohort to which the individual belongs, followed by a unique identification tag that differentiates that person from other probands in the same cohort. In many cases, the unique identification tag used in the dataset is taken directly from the patient identification tag in the original report. Otherwise, the individual is assigned an identification tag during the annotation process.

For example, for patient 5335_3 from the ASD discovery case cohort described in Pinto D 2010, the name of the individual in the module would be:
pinto_10_ASD_discovery_cases-case5335_3.
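A minimal sketch of this convention follows. The helper names are hypothetical, and two details are assumptions drawn from the single example above: the hyphen separator and the "case" prefix placed before the tag. The split also assumes that cohort names contain no hyphen.

```python
def patient_id(cohort_name, tag, prefix="case"):
    """Join a cohort name and an individual's identification tag.

    The '-' separator and the 'case' prefix are assumptions based on
    the example 'pinto_10_ASD_discovery_cases-case5335_3'.
    """
    return f"{cohort_name}-{prefix}{tag}"

def split_patient_id(pid):
    """Split a patient ID back into (cohort name, prefixed tag)."""
    cohort_name, sep, tag = pid.partition("-")
    if not sep or not tag:
        raise ValueError(f"not a patient ID: {pid!r}")
    return cohort_name, tag

pid = patient_id("pinto_10_ASD_discovery_cases", "5335_3")
print(pid)
print(split_patient_id(pid))
```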

When available, the following information is extracted and presented in the CNV module as data from individual reports:

Patient ID
External link to the report – This linked text also provides the last name of the first author of the report and the year it was published.
Clinical profile – This category can potentially contain a broad range of information, depending on the source material, including: clinical history; dysmorphic features; comorbidities commonly associated with ASD, such as ADHD, epilepsy, or sleep disturbances; and growth parameters including height, weight, and head circumference. When included in the published report, ADI-R and/or ADOS scores are also listed. Otherwise, more qualitative measures of core ASD features (deficiencies in social interactions, communication deficits, and repetitive and restricted behaviors) are provided.
Cognitive profile – Individuals with ASD often exhibit a range of intellectual deficits. Information on IQ scores or the extent of mental retardation, intellectual disability, or developmental delay is provided in this category. A cognitive profile may be either qualitative (“average,” “below-average,” etc.) or quantitative (with a numerical score or percentile values), and in some cases the testing methodology is provided.
Primary diagnosis
Patient age and gender
CNV size – The size of the largest CNV found at the locus of interest.
Deletion vs. duplication – The number of CNVs caused by deletions, duplications, or both.
Validation – Indicates whether the CNV findings have been validated.

Users can expand the report details to reveal:

The average start and end points of the CNV
The genome build of the CNV
A brief description of the validation method(s) used
The gene content of the CNV

In the expanded view, users can also find information about:

CNV Inheritance
A CNV can arise de novo or be inherited through either the maternal or paternal chromosomes. (In some cases, a CNV can be inherited from both parents.) A de novo CNV arises spontaneously in an individual and is not transmitted from either parent. There is considerable interest in de novo CNVs as a significant genetic cause of ASD, especially in simplex families. However, both maternally and paternally inherited CNVs are also believed to confer varying degrees of pathogenic risk. If the origin of a CNV has not been ascertained, its inheritance is classified in the module as “Unknown.”

CNV-Disease Segregation
Of particular importance in assessing the clinical significance of any given copy number variant is how closely the CNV associates or segregates with the disorder. For example, if a copy number variant is only identified in one or more siblings with autism, but it is not present in any unaffected siblings, the CNV is said to segregate with the disorder. However, if a copy number variant is found in an individual with autism and at least one of his or her unaffected siblings, or if a CNV is present in one sibling with autism but not in another affected sibling, then the CNV-disease association is characterized as “not segregating.” By their nature, de novo CNVs are considered to closely segregate with the disorder.
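The segregation rules described above can be expressed as a short sketch. The `Sibling` record and the `segregates` function are hypothetical names introduced for illustration; the proband is assumed to be included in the sibling list, and the sketch does not describe how curators actually record this field.

```python
from dataclasses import dataclass

@dataclass
class Sibling:
    affected: bool      # diagnosed with autism
    carries_cnv: bool   # CNV detected in this sibling

def segregates(siblings, de_novo=False):
    """Return True if the CNV segregates with the disorder.

    `siblings` includes the proband. Following the text above, de novo
    CNVs are treated as segregating.
    """
    if de_novo:
        return True
    affected = [s for s in siblings if s.affected]
    unaffected = [s for s in siblings if not s.affected]
    # Not segregating if an affected sibling lacks the CNV ...
    all_affected_carry = bool(affected) and all(s.carries_cnv for s in affected)
    # ... or if any unaffected sibling carries it.
    no_unaffected_carry = not any(s.carries_cnv for s in unaffected)
    return all_affected_carry and no_unaffected_carry

family = [Sibling(affected=True, carries_cnv=True),
          Sibling(affected=False, carries_cnv=True)]
print(segregates(family))  # CNV is also present in an unaffected sibling
```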

Family Profile
In many cases, families with individuals who have autism are categorized as either simplex or multiplex. In a simplex family, the proband identified in a CNV report is the only sibling in that family with ASD. (Simplex cases may also be referred to as sporadic cases in some scientific literature.) In a multiplex family, there is at least one sibling with autism in the proband’s family in addition to the individual identified in a CNV report. When such information is provided, the Family Profile is listed as either Simplex or Multiplex. This information is essential in understanding how closely a given CNV co-segregates with the disorder.
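As a minimal sketch of the simplex and multiplex definitions above, the hypothetical function below classifies a family from the number of affected siblings other than the proband. It returns None when that information is not provided, which mirrors the text but is an assumption about how missing data would be represented.

```python
from typing import Optional

def family_profile(other_affected_siblings: Optional[int]) -> Optional[str]:
    """Classify a family as 'Simplex' or 'Multiplex'.

    `other_affected_siblings` counts siblings with autism in addition to
    the proband; None means the report did not provide this information.
    """
    if other_affected_siblings is None:
        return None
    if other_affected_siblings < 0:
        raise ValueError("sibling count cannot be negative")
    return "Multiplex" if other_affected_siblings >= 1 else "Simplex"

print(family_profile(0))
print(family_profile(2))
```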

3.7 Connections to Other SFARI Gene Modules

The data in the CNV module is connected to some of the other modules of SFARI Gene. Every CNV entry page lists any associated human genes in which it is known to occur. Clicking on these genes will take users to the corresponding entry in the Human Gene module. Similarly, any associated animal models identified in the CNV module will link users to the corresponding entry in the Animal Models module. Data from the CNV module can also be found on single human gene and animal model entry pages; clicking on a CNV listed on those pages will take users to that CNV’s entry in the CNV module.

3.8 Statistics

The information in the CNV module is continuously updated as our dedicated research team curates new genetic information into the database. For the most up-to-date statistics, see the Statistics section of the About CNV page.
