Tags: Real-Time Primer

H. Simmler and H. Singpiel

Acconovis GmbH

Lindenhofstr. 42-44

68163 Mannheim, Germany

eMail: simmler@acconovis.com

R. Männer

Universität Mannheim

B6, 23-29

68131 Mannheim, Germany

maenner@ti.uni-mannheim.de

Abstract

The design of PCR or DNA chip experiments is a time-consuming process in which bioinformatics is used extensively. The selection of the primers, which are immobilized on the DNA chip, requires a complex algorithm. Based on several parameters, an optimized set of primers is automatically determined for a given gene sequence. This paper describes a parallel architecture which performs the optimization of the primer selection on a hardware accelerator. In contrast to the pure software approach, the parallel architecture achieves a speedup of a factor of 500 using a PCI-based hardware accelerator. This approach allows a specified primer set to be optimized in real time.

1 Introduction

Both the amplification of DNA sequences using the polymerase chain reaction (PCR) and the massively parallel analysis of genes in biological cells using DNA chips (or DNA arrays) have a great impact on modern biological research. PCR is used to amplify a particular DNA fragment called the target sequence. In general, a forward and a reverse primer are generated. The target sequence, located between the two primers, is duplicated using a complex process protocol.

DNA chips are used to analyse a large number of genes in parallel. This provides insight into cells or can improve the search for gene defects in a particular genome. DNA chips perform up to 500,000 experiments in parallel and enable the researcher to monitor the whole genome on a single chip at the same time.

Although these two applications have different aims, amplification and analysis, both techniques make use of primers. Formally, primers are considered as strings that represent a DNA sequence. This DNA sequence consists of four bases represented by the letters {A, G, T, C}. The start of the DNA sequence is denoted the 5’ end and the termination is denoted the 3’ end.

Prior to the biological experiments, either PCR or DNA chips, primers have to be designed and synthesized. In general, primer design is based on several criteria that extend beyond string matching. Typical criteria used for the design are the exact string match, the primer length, the melting temperature, the salt concentration for the experiment and the hybridization effects that have to be taken into account for the selected primers. PCR experiments need only a few different primers, whereas several thousand different primers are needed for a DNA chip. Taking the various criteria into account, the complete processing time for an optimal primer set can be hours.

Preparing a DNA experiment can be described as a workflow consisting of three steps.

1. Define the genes that have to be analysed.

2. Design the optimum primer for the gene.

3. Verify the primer in a macroscopic experiment.

Furthermore, the second step, the design, is separated into the computation of the primer sets and a database comparison of each primer. The database check compares the selected primers against the genome database to avoid a “false positive” signal that is not generated by the specified gene. This paper concentrates on the design of the primers.

Section two describes the basics of DNA chips. Section three specifies the various parameters that are used to select the optimal primers. The computation steps performed to select these optimal primers are described in section four. The fifth section presents the idea of the parallel architecture, whereas its implementation is described in section six. The results achieved with the parallel architecture are listed in section seven. The final section provides some conclusions and further applications of the parallel architecture.

2 DNA Chips

2.1 Experiments

It is believed that thousands of genes and especially their interactions are responsible for the mystery of life. Before DNA chips were available, researchers were able to look at only a few genes at the same time. Nowadays, DNA chips provide a complete set of biological experiments on one chip that can be performed simultaneously with one single probe. This enables researchers to look at a biological cell as a whole so that gene interactions or gene defects can be analysed within a short time.

The main application fields for DNA chips are gene expression analysis, single nucleotide polymorphism (SNP) detection, medical diagnostics, gene discovery, drug discovery and toxicological studies. For more information see .

2.2 DNA Chip Design

A DNA chip is divided into a matrix of spots. The number of spots can vary from low-density chips with 96 spots up to high-density chips with 500,000 spots on one chip. Each spot on a DNA chip contains primers with a unique coding. The primers are immobilized on the spots.

As shown in Figure 1, the gene sequence of interest binds to the primer. Each primer base binds to the corresponding position on the gene sequence. This prevents the gene sequence from being rinsed off during the washing process. A fluorescence marker is attached to the probe sequence so that the sequence found can be read out by a DNA chip reader.

Figure 1. Sample spot on a DNA chip.

The primers used vary depending on the chip and the experiments. Usually, the primers are between 20 and 100 bases long and are manufactured synthetically. See for more information.

3 Primer Design

A biological experiment needs a complete primer set that has to be designed. Choosing the optimal primers for a given target sequence or gene requires the evaluation and comparison of several partly independent parameters against a set of ideal values.

The user defines the target sequence and specifies a set of ideal parameter values. Each parameter specification consists of an ideal value and a range indicating valid parameter results. The minimum and maximum values are used for filtering. They reduce the number of primers that have to be analysed.

A quality score is computed to select the optimum primer. This score is defined as the sum of the distances between each parameter result and its ideal value.
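The filter-then-score selection described above can be sketched as follows. This is only an illustration under stated assumptions: the distance is taken to be the unweighted absolute difference, the lowest score is treated as best, and the names `ParamSpec`, `passes_filter`, `quality_score` and `select_best`, as well as the sample values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ParamSpec:
    """Ideal value and valid range for one parameter (hypothetical container)."""
    ideal: float
    minimum: float
    maximum: float


def passes_filter(results: Dict[str, float], specs: Dict[str, ParamSpec]) -> bool:
    """True if every parameter result lies inside its [minimum, maximum] range."""
    return all(s.minimum <= results[name] <= s.maximum for name, s in specs.items())


def quality_score(results: Dict[str, float], specs: Dict[str, ParamSpec]) -> float:
    """Sum of absolute distances to the ideal values (assumed metric); lower is better."""
    return sum(abs(results[name] - s.ideal) for name, s in specs.items())


def select_best(candidates: Dict[str, Dict[str, float]],
                specs: Dict[str, ParamSpec]) -> Optional[str]:
    """Filter by range first, then return the candidate with the smallest score."""
    valid = [name for name, res in candidates.items() if passes_filter(res, specs)]
    if not valid:
        return None
    return min(valid, key=lambda name: quality_score(candidates[name], specs))


# Illustrative values only, not taken from the paper.
specs = {"length": ParamSpec(20, 18, 22), "tm": ParamSpec(60.0, 55.0, 65.0)}
candidates = {
    "primer_a": {"length": 20, "tm": 63.0},   # score 3.0
    "primer_b": {"length": 21, "tm": 60.5},   # score 1.5
    "primer_c": {"length": 25, "tm": 60.0},   # rejected: length out of range
}
best = select_best(candidates, specs)
```

Filtering before scoring mirrors the order given in the text: out-of-range candidates are discarded cheaply, and only the remaining ones are ranked.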

3.1 Hybridization conditions

The parameters taken into account for selecting each optimal primer set have a major influence on the quality of the hybridization process, in which the primers react with the genes or target sequences. The hybridization condition is defined by the parameters described in the following paragraphs.

Primer length The primer length defines the number of bases that build the biological primer. Primarily, this length defines the selectivity of the primer. Secondly, it has an influence on the melting temperature and the hybridization effects. The primer length is used to generate the primers from the specified sequence windows that are evaluated. See section 4 for more details.
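One plausible reading of generating primers from sequence windows is a sliding window over the target sequence for every allowed length. The sketch below assumes exactly that; the function name `candidate_primers` and the length bounds are hypothetical, and the procedure actually used is the one described in section 4.

```python
from typing import Iterator, Tuple


def candidate_primers(target: str, min_len: int, max_len: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, primer) for every window of each allowed length.

    Assumption: every substring of an allowed length is a candidate.
    """
    target = target.upper()
    for length in range(min_len, max_len + 1):
        for start in range(len(target) - length + 1):
            yield start, target[start:start + length]


windows = list(candidate_primers("GATTACA", 5, 6))
```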

Melting temperature The primer and the searched gene sequence correspond to each other so that each base can bind to its counterpart. The melting temperature is the temperature at which the bonds between primer and gene dissolve.

This is an important parameter for PCR and also for DNA chips, because it helps to avoid bindings that use only a fraction of the available primer and cause a “false positive” signal. The melting temperature for a given primer $p = (p_1, \ldots, p_n)$ is calculated with a prominent approximation for the melting temperature. The formula is:

$$T_m(p) = \frac{\Delta H(p)}{\Delta S(p) + R \ln(\gamma / 4)} + T_0 + t \qquad (1)$$

where $R = 1.987$ cal/(°C mol) is the molar gas constant, $\gamma = 50 \times 10^{-9}$ is the molar concentration of the primer in its solution, $T_0 = -273.15$ °C, and $t = -21.6$ °C is an empirical temperature correction. The value $t$ may depend upon the ion concentration and other unknown factors. The enthalpy $\Delta H(p)$ and the entropy $\Delta S(p)$ of the primer $p$ are computed according to the nearest neighbor schemata [9]

$$\Delta H(p) = \sum_{i=1}^{n-1} \Delta H(p_i p_{i+1})$$

and

$$\Delta S(p) = \sum_{i=1}^{n-1} \Delta S(p_i p_{i+1})$$

where $\Delta H(p_i p_{i+1})$ and $\Delta S(p_i p_{i+1})$ are the enthalpy and entropy of a string consisting of two bases. The values used for the base combinations are listed in Table 1. The values in Table 1 refer to the energy required to disrupt the hydrogen bonds of a single base pair of a paired chain. It is assumed to be influenced by neighboring bases. More details can be seen in .
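The nearest-neighbor computation can be sketched in a few lines. The sketch assumes the common form $T_m = \Delta H / (\Delta S + R \ln(\gamma/4)) + T_0 + t$, with $T_0$ taken as the kelvin-to-Celsius offset of -273.15 °C, enthalpies in cal/mol and entropies in cal/(K mol). The two lookup tables are placeholders that give every base pair the same value; they are not the values of Table 1, and all function and constant names are hypothetical.

```python
import math
from itertools import product
from typing import Dict

R = 1.987        # molar gas constant in cal/(degC mol), as given in the text
GAMMA = 50e-9    # molar concentration of the primer, as given in the text
T0 = -273.15     # assumed kelvin-to-Celsius offset
T_CORR = -21.6   # empirical temperature correction t, as given in the text


def nearest_neighbor_sum(primer: str, table: Dict[str, float]) -> float:
    """Sum the table value of every adjacent two-base string p_i p_{i+1}."""
    return sum(table[primer[i:i + 2]] for i in range(len(primer) - 1))


def melting_temperature(primer: str, dh_table: Dict[str, float],
                        ds_table: Dict[str, float]) -> float:
    """Approximate melting temperature in degC under the assumptions above."""
    primer = primer.upper()
    if len(primer) < 2:
        raise ValueError("a primer needs at least two bases")
    dh = nearest_neighbor_sum(primer, dh_table)   # cal/mol (assumed unit)
    ds = nearest_neighbor_sum(primer, ds_table)   # cal/(K mol) (assumed unit)
    return dh / (ds + R * math.log(GAMMA / 4.0)) + T0 + T_CORR


# Placeholder tables: one made-up value for all 16 base pairs, NOT Table 1.
PLACEHOLDER_DH = {a + b: -8000.0 for a, b in product("ACGT", repeat=2)}
PLACEHOLDER_DS = {a + b: -20.0 for a, b in product("ACGT", repeat=2)}

tm = melting_temperature("GATTACAGATT", PLACEHOLDER_DH, PLACEHOLDER_DS)
```

Passing the tables in as arguments keeps the function independent of any particular set of thermodynamic values, so real nearest-neighbor data can be substituted without changing the code.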

Table 1. Nearest neighbor thermodynamics values

GC content Chemically, hydrogen bonds between the bases of the primer and the gene are responsible for a stable binding. G-C pairs form three hydrogen bonds and are more stable than A-T pairs which form only two hydrogen bonds. Thus, a high GC content results in a greater stability between primer and gene.

The GC content simply measures the number of G and C bases in the primer. The following formula is used:
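A minimal sketch of a GC content measure, assuming it is reported as the fraction of bases that are G or C (the plain count would be an equally plausible reading of the text). The function name `gc_content` is hypothetical.

```python
def gc_content(primer: str) -> float:
    """Fraction of bases in the primer that are G or C (assumed normalisation)."""
    primer = primer.upper()
    if not primer:
        raise ValueError("primer must not be empty")
    return sum(base in "GC" for base in primer) / len(primer)


gc_short = gc_content("GATTA")        # one G among five bases
gc_long = gc_content("TTCGTACGAAC")   # five G/C bases among eleven
```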

Secondary structure Above, only the linear sequence, also known as the primary structure, is considered. Besides this primary structure, the secondary structure and its effects also have to be taken into account. The secondary structure considers the fact that primers are flexible and that base pairs may bind to each other, generating structures [6]. The DNA double helix is one example of a secondary structure. Other important secondary structures for the primer design are primer-primer bindings and hairpin, bulge or internal loops. The secondary structure is an important criterion for the selection of the primer, because a hybridization between base pairs can disable the primer for the actual hybridization.

Interaction between primers, either for PCR or for DNA chips, must be avoided to conserve the maximum sensitivity of the primer and the spot on the DNA chip.

The following four paragraphs show the criteria that are used for the detection of secondary structure effects. The calculation compares the primer sequences and examines the possibility of a hybridization of a primer to itself or to another primer. All calculations use the same basic score function for comparing a base pair. This score function is defined as

where $p = (p_1, \ldots, p_n)$ and $q = (q_1, \ldots, q_m)$ are the two primers that are analysed for a possible secondary structure.
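A minimal sketch of such a base-pair score function, assuming that complementary pairs score positively and every other combination scores zero. The weights 2 for A-T and 4 for G-C are an assumption chosen to reflect the weaker and stronger bonds of the two pair types; they are not taken from this text, and `pair_score` is a hypothetical name.

```python
def pair_score(x: str, y: str, at_score: int = 2, gc_score: int = 4) -> int:
    """Score one base of primer p against one base of primer q.

    Assumed weights: A-T pairs score at_score, G-C pairs score gc_score,
    anything else (including identical bases) scores 0.
    """
    pair = {x.upper(), y.upper()}
    if pair == {"A", "T"}:
        return at_score
    if pair == {"G", "C"}:
        return gc_score
    return 0
```

Keeping the weights as parameters makes it easy to substitute whatever values a particular scoring scheme prescribes.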

Self Annealing The self annealing (SA) calculation estimates the possibility of an unintended hybridization to the primer itself. Therefore, the SA score indicates the probability of generating hairpin and internal loops.

The calculation is done using the original primer and an opposite version of the same primer. The following example shows these two primers.

original primer 5’-TTCGTACGAAC-3’

opposite primer 3’-CAAGCATGCTT-5’

Both primers are defined as

original primer $p = (p_1, \ldots, p_n)$

opposite primer $q = (q_1, \ldots, q_m)$

The calculation of the SA score starts with the opposite primer shifted to the left so that only one position overlaps with the original primer. It compares each overlapping position using score function (3) and accumulates the single score values into an alignment score. After this alignment has been computed, the opposite primer is shifted one position to the right and the new alignment score is calculated. This continues until there is only one overlapping position at the right end of the original primer. The maximum of all alignment scores is used as the SA score for the given primer. The complete function is defined as

The SA score calculation requires the evaluation of all possible alignments. The number of alignments $k_{SA}$ depends on the primer lengths and can be calculated using the formula $k_{SA} = (n - 1) + (m - 1) + 1$, that is, $n + m - 1$ shift positions with at least one overlapping base.

The example in Table 2 shows the calculation of the SA score for the primer p = GATTA. The table shows the alignments and the resulting SA scores.
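The shifting procedure for the SA score can be sketched as follows for the primer GATTA. The sketch assumes that the opposite primer is the original primer read in reverse, as in the TTCGTACGAAC example, and that the base-pair score is 2 for A-T, 4 for G-C and 0 otherwise; these weights and all function names are assumptions, so the numbers need not match those of Table 2.

```python
from typing import List


def pair_score(x: str, y: str) -> int:
    """Assumed base-pair score: 2 for A-T, 4 for G-C, 0 otherwise."""
    pair = {x, y}
    if pair == {"A", "T"}:
        return 2
    if pair == {"G", "C"}:
        return 4
    return 0


def alignment_scores(p: str, q: str) -> List[int]:
    """Score every shift of q against p that leaves at least one overlap."""
    n, m = len(p), len(q)
    scores = []
    # shift = -(m - 1): only the last base of q overlaps the first base of p
    # shift = n - 1:    only the first base of q overlaps the last base of p
    for shift in range(-(m - 1), n):
        total = 0
        for j in range(m):
            i = j + shift                 # position in p facing q[j]
            if 0 <= i < n:
                total += pair_score(p[i], q[j])
        scores.append(total)
    return scores


def self_annealing(primer: str) -> int:
    """Maximum alignment score of a primer against its opposite version."""
    primer = primer.upper()
    return max(alignment_scores(primer, primer[::-1]))


sa_alignments = alignment_scores("GATTA", "ATTAG")
sa_score = self_annealing("GATTA")
```

For a primer of length 5 the loop visits 5 + 5 - 1 = 9 shift positions, in line with the count of alignments given above.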

Table 2. Self annealing calculation for the example.

Self End Annealing The self end annealing (SEA) calculation is very similar to the SA calculation, but it considers only those alignments where the 3’ end of the original primer belongs to the overlapping region. Furthermore, the SEA score is accumulated only for those overlaps which are continuous. Therefore, the SEA score evaluates the probability of generating hairpin loops or other primer-primer interactions that start from the 3’ end and are continuous.

Because only those alignments are considered where the 3’ end is involved, fewer alignments have to be evaluated. Their number is equal to the primer length.

The calculation of the SEA score starts in the shift position where all bases of the original and the opposite primer overlap. All overlapping positions, starting from the 3’ end of the original primer, are accumulated using score function (3). If a base pair does not match, the accumulation is aborted. After the alignment score has been computed, the opposite primer is shifted to the right by one position and the new alignment score is computed. This process continues until only one position at the 3’ end of the original primer overlaps. The SEA score is the maximum of all alignment scores.

The primer from the SA example is taken to show the computation of the SEA score in Table 3. Altogether, 5 alignments have to be evaluated for the primer p = GATTA to obtain the SEA score. These alignments and the resulting scores are shown.
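The SEA procedure can be sketched in the same style. The assumptions are the same as for the SA calculation: the opposite primer is the reversed original, and the base-pair score is 2 for A-T, 4 for G-C and 0 otherwise. Both the weights and the function names are hypothetical, so the values need not match Table 3.

```python
from typing import List


def pair_score(x: str, y: str) -> int:
    """Assumed base-pair score: 2 for A-T, 4 for G-C, 0 otherwise."""
    pair = {x, y}
    if pair == {"A", "T"}:
        return 2
    if pair == {"G", "C"}:
        return 4
    return 0


def sea_alignment_scores(primer: str) -> List[int]:
    """One score per alignment that includes the 3' end of the original primer."""
    p = primer.upper()
    q = p[::-1]                      # opposite primer
    n = len(p)
    scores = []
    for shift in range(n):           # shift 0: full overlap; n - 1: one base
        total = 0
        # Walk from the 3' end of p towards the 5' end and stop at the
        # first position that does not pair (continuous overlap only).
        for i in range(n - 1, shift - 1, -1):
            s = pair_score(p[i], q[i - shift])
            if s == 0:
                break
            total += s
        scores.append(total)
    return scores


def self_end_annealing(primer: str) -> int:
    """Maximum over all 3'-anchored alignment scores (0 for an empty primer)."""
    return max(sea_alignment_scores(primer), default=0)


sea_alignments = sea_alignment_scores("GATTA")
sea_score = self_end_annealing("GATTA")
```

The early `break` is what distinguishes SEA from SA here: a mismatch ends the accumulation instead of merely contributing zero.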

Table 3. Self end annealing calculation for the example.

If a primer binds to itself or to a primer with the same sequence, a secondary structure is generated. Secondary structures are also built when several primers are combined for a PCR experiment. Several primers are mixed in one tube, where each primer can bind to any other and inhibit the reaction with the target sequence or the gene.

Therefore, the effects with different primers must be considered when the optimal primers are selected.

Pair Annealing The pair annealing (PA) calculation takes the interaction of different primers into account and is calculated for all possible primer pairs. The models and the computing process are similar to the SA tests. Each primer pair is processed using formula (4).

For PCR the PA scores are calculated for each primer pair that is considered to work together. On DNA chips the PA score is used to compare the primer from each spot against the primers from all other spots.
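The all-pairs comparison can be sketched as follows, assuming that the PA score of two primers is the maximum alignment score of the first primer against the reversed second primer, computed exactly as in the SA procedure. The weights 2 for A-T and 4 for G-C, the function names and the sample primers are assumptions for illustration only.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def pair_score(x: str, y: str) -> int:
    """Assumed base-pair score: 2 for A-T, 4 for G-C, 0 otherwise."""
    pair = {x, y}
    if pair == {"A", "T"}:
        return 2
    if pair == {"G", "C"}:
        return 4
    return 0


def pair_annealing(p: str, q: str) -> int:
    """Maximum alignment score of p against the opposite version of q."""
    p, q = p.upper(), q.upper()[::-1]
    n, m = len(p), len(q)
    best = 0
    for shift in range(-(m - 1), n):      # all shifts with some overlap
        total = sum(pair_score(p[j + shift], q[j])
                    for j in range(m) if 0 <= j + shift < n)
        best = max(best, total)
    return best


def pa_scores(primers: List[str]) -> Dict[Tuple[str, str], int]:
    """PA score for every unordered pair of distinct list entries."""
    return {(a, b): pair_annealing(a, b) for a, b in combinations(primers, 2)}


spot_primers = ["GATTA", "TACG", "GC"]    # illustrative primers only
scores = pa_scores(spot_primers)
```

For a chip with N spots this evaluates N(N-1)/2 pairs, which grows quadratically with the number of spots and is one reason the comparison is computationally expensive.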
