Patent title: Method for achieving improved polypeptide expression
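Technical field:
The present invention relates to a method for producing a polypeptide in a host cell, wherein the nucleotide sequence encoding the polypeptide has been modified with respect to the codon usage of the host cell, in particular the codon pairs used, so as to obtain improved expression of the nucleotide sequence encoding the polypeptide and/or improved production of the polypeptide.
Background art: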
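The present invention relates to improved methods for producing polypeptides. Numerous approaches have been applied to generate strains for protein overexpression and/or production. These include, but are not limited to, constructing strains that carry multiple copies of the gene encoding the protein of interest (POI) and applying strong promoter sequences.

Each specific amino acid is encoded by a minimum of one codon and a maximum of six codons. Prior research has shown that codon usage in genes encoding the cell's polypeptides is biased among species (Kanaya, S., Y. Yamada, Y. Kudo and T. Ikemura (1999) Studies of codon usage and tRNA genes at 18 unicellular organisms and quantification of Bacillus subtilis tRNAs: gene expression level and species-specific diversity of codon usage based on multivariate analysis. Gene 238:143-155). Prior publications disclose optimization of codon usage in a given host cell to improve polypeptide production (see WO 97/11086 for an example). More specifically, WO 03/70957 describes optimized codon usage in filamentous fungi for producing plant polypeptides. In all these cases of "classic" codon optimization, a native codon is replaced by the most frequent codon from a reference set of genes, and the rate of codon translation for each amino acid is designed to be high (optimized). More recently, WO 03/85114 described a harmonization of codon usage, which takes into account the distribution of all codons in genes of the host organism, on the assumption that this distribution affects protein folding.

In recent years, the availability of fully sequenced genomes of many organisms, such as Bacillus subtilis (Kunst et al. 1997), Bacillus amyloliquefaciens, Aspergillus niger (Pel et al., 2007, Nat Biotech. 25:221-231), Kluyveromyces lactis, Saccharomyces cerevisiae (http://www.yeastgenome.org/), various plant genomes, mouse, rat and human, has made it possible to analyse different aspects of gene sequences in relation to their natural expression level (mRNA or protein level). A good example is codon usage (bias) analysis and subsequent single-codon optimization. Note that single-codon optimization is herein understood to mean codon optimization or codon harmonization techniques that focus on optimizing codons as single independent entities, in contrast to codon-pair optimization, which is the subject of the present invention.

Whereas single-codon usage (bias) has been studied extensively (for a review see Gustafsson et al., 2004, Trends Biotechnol. 22:346-353), there are only a few reports on codon-pair usage and codon-pair optimization. The effect of a few specific codon pairs on ribosomal frameshifting in E. coli has been investigated, for example for the AGG-AGG codon pair (Spanjaard and van Duin, 1988, Proc. Natl. Acad. Sci. USA 85:7967-7971; Gurvich et al., 2005, J. Bacteriol. 187:4023-4032) and for UUU-YNN sites (Schwartz and Curran, 1997, Nucleic Acids Res. 25:2005-2011). Gutman and Hatfield (1989, Proc. Natl. Acad. Sci. USA 86:3699-3703) analysed a larger set of sequences for all possible codon pairs in E. coli and found that codon pairs are directionally biased. In addition, they observed that highly underrepresented pairs are used almost twice as frequently as overrepresented pairs in highly expressed genes, whereas in poorly expressed genes overrepresented pairs are used more frequently. US 5,082,767 (Hatfield and Gutman, 1992) discloses a method for determining relative native codon-pair preferences in an organism and altering the codon pairing of a gene of interest in accordance with those preferences, thereby changing the translational kinetics of the gene in a predetermined manner, with examples for E. coli and S. cerevisiae. In their method, however, Hatfield and Gutman optimize only individual pairs of adjacent codons. Moreover, their patent (US 5,082,767) claims increasing the translational kinetics of at least a portion of a gene by a modified sequence in which codon pairing is altered to increase the number of codon pairs that, compared with random codon-pair usage, are the more abundant yet more underrepresented codon pairs in the organism. The present invention instead discloses a method of increasing translation by a modified sequence in which codon pairing is altered to increase the number of codon pairs that, compared with random codon-pair usage, are the more overrepresented codon pairs in the organism.

Moura et al. (2005, Genome Biology 6:R28) analysed the entire S. cerevisiae ORFeome but found no statistically significant bias for about 47% of the codon pairs. The respective values differ between species, resulting in a "codon context map" that can be regarded as a "species-specific fingerprint" of codon-pair usage.

Boycheva et al. (2003, Bioinformatics 19(8):987-998) identified two sets of codon pairs in E. coli, termed hypothetically attenuating and hypothetically non-attenuating, by looking for overrepresented and underrepresented codon pairs among genes with high and low expression. However, they neither propose a method for applying this finding nor give any experimental proof for their hypothesis. Note that these groups are defined in a way completely opposite to those of Gutman and Hatfield (1989, 1992; supra), who proposed a non-attenuating effect for highly underrepresented pairs in highly expressed genes.

Buchan, Aucott and Stansfield (2006, Nucleic Acids Research 34(3):1015-1027) analysed tRNA properties with respect to codon-pair bias.

As to the implications of bias in codon-pair utilization, Irwin et al. (1995, J. Biol. Chem. 270:22801-22806) demonstrated in E. coli that the rate of synthesis actually decreased substantially when a highly underrepresented codon pair was replaced by a highly overrepresented one, and increased when a slightly underrepresented codon pair was replaced by a more highly underrepresented one. This is remarkable because it is rather the opposite of what one would expect given the influence of single-codon bias on protein levels.

However, none of the art cited above discloses how to optimize the codon-pair usage of a full-length codon sequence while taking into account that codon pairs overlap by definition, so that optimizing each individual codon pair affects the bias of the overlapping upstream and downstream codon pairs. Nor does any of the art cited above disclose a method that combines optimization of single codons with optimization of codon pairs. Codon-pair optimization that takes codon-pair overlap into account, optionally combined with single-codon optimization, would greatly improve expression of the nucleotide sequence encoding the polypeptide of interest and/or improve production of the polypeptide. Thus, there is still a need in the art for novel methods for optimizing coding sequences for improved production of a polypeptide in a host cell.

Summary of the invention

An object of the present invention is to provide a method that optimizes a coding sequence for efficient gene transcription and protein translation. To this end, the invention provides a method for optimizing a nucleotide sequence encoding a predetermined amino acid sequence, wherein the coding sequence is optimized for expression in a predetermined host cell, the method comprising:
(a) generating at least one original coding sequence that encodes the predetermined amino acid sequence;
(b) generating at least one newly generated coding sequence from the at least one original coding sequence by replacing one or more codons in the at least one original coding sequence with synonymous codons;
(c) determining a fitness value of the at least one original coding sequence and a fitness value of the at least one newly generated coding sequence, using a fitness function that determines at least one of single-codon fitness and codon-pair fitness for the predetermined host cell;
(d) choosing one or more selected coding sequences among the at least one original coding sequence and the at least one newly generated coding sequence in accordance with a predetermined selection criterion, such that the higher the fitness value, the higher the chance of being chosen; and
(e) repeating actions (b) to (d), while treating the one or more selected coding sequences as one or more original coding sequences in actions (b) to (d), until a predetermined iteration stop criterion is fulfilled.

In some embodiments, the invention relates to aspects such as single-codon usage, codon harmonization and dinucleotide usage, and to codon-pair bias.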
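The method can be performed by a computer program running on a computer, which uses mathematical algorithms for sequence analysis and sequence optimization that can be implemented in MATLAB (http://www.mathworks.com/).

In addition to positive codon optimization (e.g. for regulating gene expression and protein production in a positive way), the invention also provides methods for adapting codons towards "bad" codon pairs (i.e. negative codon-pair optimization). The latter method is useful for control purposes and for regulating gene expression in a negative way.

Brief description of the drawings

The invention will be explained with reference to the following drawings, which are intended only to illustrate the invention and not to limit its scope. The scope of the invention is defined only by the appended claims and their equivalents.

Figure 1 shows a computer arrangement on which the method of the invention can be performed.

Figure 2 shows a flow chart of an embodiment of the invention.

Figure 3 shows the distribution of codon-pair bias values for the 3,721 sense:sense codon pairs in different organisms. The number in the upper right corner of each histogram is the standard deviation of the observed distribution; the mean values (not shown) lie between -0.06 and -0.01 for all organisms.

Figure 4 shows the correlation of codon-pair bias across various organisms. The correlation coefficient is shown in the upper right corner of each subplot.

Figure 5 shows the codon bias map for A. niger. Bias values range from -0.67 to 0.54, whereas in other organisms they may even slightly exceed ±0.9 (see also Figure 3). The highest intensities of black in these charts represent values of 0.9 (Figures 5A and 5C, positive values, originally green) and -0.9 (Figures 5B and 5D, negative values, originally red). In Figures 5A and 5B the rows and columns are sorted alphabetically by codon. In Figures 5C and 5D the rows are sorted alphabetically with the third-position nucleotide as the primary sort criterion, the middle-position nucleotide as the second and the first-position nucleotide as the third.

Figure 6 shows the codon bias map for B. subtilis. Bias values range from -0.97 to 0.87, whereas in other organisms they may even slightly exceed ±0.9 (see also Figure 3). The highest intensities of black represent values of 0.9 (Figure 6A, positive values, originally green) and -0.9 (Figure 6B, negative values, originally red).

Figure 7 shows the codon bias map for E. coli. Bias values range from -0.97 to 0.85, whereas in other organisms they may even slightly exceed ±0.9 (see also Figure 3). The highest intensities of black represent values of 0.9 (Figure 7A, positive values, originally green) and -0.9 (Figure 7B, negative values, originally red).

Figure 8 shows a codon bias map, analogous to Figures 5-7 above, for 479 highly transcribed genes of A. niger. The highest intensities of black represent values of 0.9 (Figure 8A, positive values, originally green) and -0.9 (Figure 8B, negative values, originally red). The most extreme bias value in this group is -1, i.e. some possible codon pairs do not occur at all, although their individual codons and the encoded amino acid pair do. This may be a result of the smaller number of 188,067 codon pairs, compared with 5,885,942 in the whole genome. The main reason, however, is likely to be a genuine underrepresentation of such pairs caused by selection in highly expressed genes.

Figure 9 shows a scatter plot of the bias in a group of 479 highly expressed A. niger genes (vertical axis) against the bias in all genes (horizontal axis). All 3,721 codon pairs not involving stop codons are shown. Colours from light grey to black are assigned according to the absolute z-score in the whole genome (i.e. light dots have no significant bias in all genes), and sizes according to the absolute z-score in the highly expressed group, i.e. very small dots have no significant bias there (here |z-score| < 1.9). The solid black line indicates where both bias values are equal; the dashed line shows the best linear approximation of the actual correlation (determined by principal component analysis); its slope is around 2.1.

Figure 10 shows the fitness values of 4,584 A. niger genes compared with the logarithm of their transcription level. The correlation coefficient is -0.62.

Figure 11 shows single-codon optimization compared with codon-pair optimization. The wild type (fit_sc(g_FUA) = 0.165, fit_cp(g_FUA) = 0.033) does not fit in the diagram (it would lie far to the upper right). It is clear that the cpi parameter determines the trade-off between single-codon and codon-pair fitness. The optimal gene is always the one with the lowest fit_sc and fit_cp values. From the position of the dots it remains unclear for which cpi value the best gene is obtained, because it is not yet known whether single-codon usage or codon-pair usage is more important. The examples nevertheless provide strong evidence that codon-pair fitness is very important in addition to single-codon fitness, which indicates that cpi should be chosen to be at least greater than 0.

Figure 12 presents two diagrams showing the sequence quality of the first 20 codons (out of 499) of the aforementioned FUA (see also Example 2). Black dots indicate the desired codon ratio, and x-marks show the actual codon ratio (in the whole gene); the two are connected by dashed lines. Single-codon fitness can then be interpreted as the mean length of these dashed lines (note that this "length" is zero for codons whose expected and actual ratios are equal, e.g. TGG at positions 4 and 5, which has no synonymous codons; note also that the "length" cannot be negative). The black bars, in turn, show the weight of the pair formed by two adjacent codons. The black dots (in the middle, below the bars) indicate the minimum weight of any codon pair encoding the same dipeptide. Codon-pair fitness is then the mean height of these bars (note that the height used here may well be negative).

Figure 13 indicates the convergence of fit_combi for optimization of the amyB gene using the genetic algorithm approach of the invention, which results in SEQ ID NO. 6.

Figure 14 shows, for explanation, part of a single-codon distribution chart (such as the one shown in Figure 15). The two diagrams indicate the single-codon usage of the two synonymous codons encoding phenylalanine, UUU (upper) and UUC (lower). The X and Y axes of both diagrams run from 0% to 100%. The grey histogram shows codon usage, normalized per amino acid (synonymous codon group), for a set of 250 highly expressed A. niger genes, with the genes binned into groups of 0%, >0-<10%, 10-<20%, ..., 90-<100% and 100%. For example, 50% of the highly expressed genes fall into the group with 0% UUU codon usage, and hence 100% UUC codon usage, to encode phenylalanine. The white bars give the codon usage of gene A (in this case WT amyB) in the same bins as the histogram; thus for gene A 100% is in bin 20-<30% (20% being 3/15 codons UUU) and therefore 100% in bin 80-<90% (80% being 12/15 codons UUC). The black bars give the statistics for gene B (in this case a single-codon-optimized variant of amyB). In a similar way, a 16-by-4 grid of diagrams can be produced to show the statistics of all 64 codons; see, for example, Figure 15.

Figure 15 (parts 1 and 2) depicts the single-codon frequencies of the single-codon-optimized amyB gene (black) compared with the wild-type amyB gene (white). The grey histogram depicts the statistics of 250 highly expressed genes in A. niger. Clearly, a real improvement is made for certain codons, such as those for cysteine (UGU/UGC), histidine (CAU/CAC), tyrosine (UAU/UAC) and others.

Figure 16 (parts 1 and 2) depicts the single-codon frequencies of the amyB gene that has been optimized with respect to both single codons and codon pairs (black) compared with wild-type amyB (white). The grey histogram depicts the statistics of 250 highly expressed genes. Clearly, these diagrams closely resemble those of the single-codon-optimized gene depicted in Figure 15.

Figure 17 depicts part of the full chart (Figure 18) of single-codon and codon-pair statistics for the WT amyB gene of A. niger. The X axis shows the consecutive codons in the gene, starting with the start codon ATG at position 1. The black dots "." indicate the target single-codon ratio of the codon at that position relative to its synonymous codons; for ATG this is 1.0 (100%). The crosses "x" are the actual codon ratios in the gene shown; the dashed lines show the difference between target and actual ratio. Codon-pair weights are values between -1 and 1. The bars indicate the actual codon-pair weight of adjacent codons, while the five-pointed stars indicate the weight of the best attainable synonymous codon pair (disregarding neighbouring pairs). For example, the first bar is -0.23, the weight of 'ATG-GTC', and the second bar is 0.66, the weight of 'GTC-GCG'.

Figure 18 depicts the single-codon and codon-pair statistics for SEQ ID NO. 2 (WT AmyB).

Figure 19 depicts the single-codon and codon-pair statistics for SEQ ID NO. 5 (single-codon-optimized AmyB).

Figure 20 depicts the single-codon and codon-pair statistics for SEQ ID NO. 6 (single-codon- and codon-pair-optimized WT AmyB).

Figure 21 depicts a plasmid map of the expression vector pGBFINFUA-1. Figure 21 also provides a representative map for plasmids pGBFINFUA-2 and pGBFINFUA-3. All clones are derived from the pGBFIN-12 expression vector (described in WO 99/32617). Indicated are the glaA flanking regions relative to the variant sequences of the amyB promoter and the A. niger amyB cDNA sequence encoding alpha-amylase.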
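The E. coli DNA can be removed by restriction enzyme digestion prior to transformation of the strain.

Figure 22 depicts a schematic representation of integration through single homologous recombination. The expression vector comprises the selectable amdS marker and the glaA promoter linked to the amyB gene. These features are flanked by homologous regions of the glaA locus (3' glaA and 3'' glaA, respectively) to direct integration at the genomic glaA locus.

Figure 23 depicts the alpha-amylase activity in culture broth of A. niger strains expressing three different constructs. Depicted is the alpha-amylase activity in culture broth of A. niger strains expressing the native amyB construct in which (1) the translation initiator sequence and translation terminator sequence were modified (pGBFINFUA-1), (2) the translation initiator sequence, translation terminator sequence and single-codon usage were modified (pGBFINFUA-2), and (3) the translation initiator sequence, translation terminator sequence, single-codon usage and codon-pair usage were modified (pGBFINFUA-3), the modifications being made according to the method of the invention. Alpha-amylase activity is given in relative units [AU], with the mean of the 6 single-copy strains among the 10 strains of the FUA1 group at day 4 set at 100%. The 10 transformants shown per group were isolated and cultivated independently.

Figure 24 (A and B) depicts the single-codon frequencies for single-codon optimization in Bacillus species. See Figure 14 for an explanation of the subplots. The grey histogram represents the codon distribution of the 50 most highly expressed genes in B. subtilis, see text. The black bars indicate the target single-codon frequencies.

Figure 25 depicts the single-codon and codon-pair statistics for SEQ ID NO. 14 (1/3), SEQ ID NO. 17 (2/3) and SEQ ID NO. 14 (3/3), which were optimized using codon-pair + single-codon (1/3), single-codon (2/3) and negative codon-pair + single-codon optimization (3/3), respectively. See Figure 17 for an explanation of the diagrams.

Figure 26 depicts the E. coli/Bacillus shuttle vector pBHA-12. Multiple cloning sites (MCS) 1 and 2 are indicated.

Figure 27 depicts an example of cloning a gene into the E. coli/Bacillus shuttle vector pBHA-12. The figure shows the cloned parts A and B of SEQ ID NO. 9 (grey arrows). Indicated are the cloning sites NdeI and BamHI for part 1A and the cloning sites SmaI and KpnI for part 1B. The E. coli part is excised using PvuII.

Detailed description of the invention

Besides single-codon bias, other structures in nucleotide sequences are likely to influence protein expression as well, for example dinucleotides or repeats of certain short nucleotide sequences (codon usage can, after all, be seen as a pattern of trinucleotide repeats aligned with the reading frame). This work presents a method for identifying preferences for certain codon pairs, i.e. for determining whether codons occur in genes as if they were selected according to the identified codon usage ratios and then distributed randomly over the gene (with respect to the amino acid sequence), or whether some codons occur more often next to certain codons and less often next to others. Analysis of codon pairs also covers other aspects, namely dinucleotide usage around the reading frame and possible preferences for a certain single nucleotide next to a codon.

The invention discloses a method for generating a codon-pair bias table for a given host organism, using as input either all identified ORFs of a fully sequenced genome or a selected group of genes (e.g. highly expressed genes). The invention further discloses a method in which the codon-pair bias table so identified is subsequently used to optimize the codon-pair distribution in a gene of interest (GOI) to improve expression of the corresponding protein of interest (POI).

Single-codon optimization offers a good starting point for improving the expression level of a protein of interest. Others have tried to overcome the drawbacks caused by rejected codons in the gene of interest by adapting the host cell, i.e. by inserting extra copies of tRNA genes for tRNAs of low abundance (e.g. Stratagene BL21-CodonPlus™ competent cells and Novagen Rosetta™ host strains, both E. coli). The present inventors instead focus on adapting the gene of interest itself. Unwanted codons in the gene are replaced by synonymous ones so that the single-codon distribution of the resulting sequence is as close as possible to the previously identified desired codon ratios. This codon harmonization, however, still leaves a very large number of possible genes that are equally "optimal", since the overall codon distribution in the optimized gene is the selection criterion. Other desired properties of the codon sequence, such as the absence of certain restriction sites or of codon pairs known to cause frameshifts, can therefore easily be taken into account.

Going further, one could optimize codon-pair usage to a limited extent. But when the codon pairs of a gene are optimized (e.g. towards the most abundant codon-pair usage), the single-codon usage of the resulting sequence may not be close to the optimum, because there may be preferred codon pairs consisting of underrepresented single codons. A balance between single-codon and codon-pair optimization must therefore be found. The invention discloses methods that allow single-codon and codon-pair optimization to be balanced. Codon-pair optimization that takes codon-pair overlap into account, optionally combined with single-codon optimization, greatly improves expression of the nucleotide sequence encoding the polypeptide of interest and/or improves production of the polypeptide.

In the context of the invention, a nucleotide coding sequence or coding sequence is defined as a nucleotide sequence encoding a polypeptide. The boundaries of the coding sequence are generally determined by a start codon at the beginning of the open reading frame at the 5' end of the mRNA (usually ATG in eukaryotes, whereas in prokaryotes it may be one of ATG, CTG, GTG or TTG) and a stop codon immediately downstream of the open reading frame at the 3' end of the mRNA (generally one of TAA, TGA or TAG, although exceptions to this "universal" code exist). A coding sequence can include, but is not limited to, DNA, cDNA, RNA and recombinant nucleic acid (DNA, cDNA, RNA) sequences (note that, as is well known in the art, uracil, U, replaces the deoxynucleotide thymine, T, in RNA). If the coding sequence is intended for expression in a eukaryotic cell, a polyadenylation signal and a transcription termination sequence will usually be located 3' to the coding sequence. A coding sequence comprises a translation initiator coding sequence, optionally a signal sequence and optionally one or more intron sequences. Although the terms "coding sequence" and "gene" strictly speaking do not denote the same entity, the two terms are frequently used interchangeably herein, and the skilled person will understand from the context whether the term refers to a full gene or only to its coding sequence.

Methods and computer arrangement for single-codon and/or codon-pair adaptation

With regard to the single-codon usage properties of highly expressed genes, a "manual" comparison of the single-codon ratios in all genes and in a set of highly expressed genes leads to a set of "desired codon ratios" for improving the expression level of a gene. Single-codon adaptation of a gene can then be performed by:
(1) calculating the actual ratios in the gene and repeatedly picking (e.g. at random) a codon whose desired ratio is lower than its actual ratio and replacing it with a synonymous codon whose ratio is too low; or
(2) using the "desired codon ratios" to calculate the desired number of each codon, making pools of synonymous codons, and repeatedly picking (e.g. at random), for each position in the gene, a codon from the synonymous pool encoding the predetermined amino acid;
then making multiple variants using method (1) and/or (2) and picking the most relevant gene on the basis of additional selection criteria (e.g. wanted and unwanted restriction sites and/or folding energy).

This approach is not suitable for codon-pair adaptation, however. First, visual inspection of the bias data for all codon pairs is impossible given their complexity. Second, changing one codon pair (which means replacing at least one of the two participating codons) also affects at least one neighbouring codon pair, so "desired codon-pair ratios" would be unattainable. Because of these constraints, a deterministic approach was considered too complex and not promising enough, and a "genetic algorithm" approach was chosen instead.

The term "genetic algorithm" may be confusing in that it seems to refer to genetic engineering. A "genetic algorithm", however, is an approach from computer science that is used to find approximate solutions to multidimensional optimization problems (Michalewicz, Z., Genetic Algorithms + Data Structures = Evolution Programs, Springer Verlag 1994; David E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading MA, 1989; http://en.wikipedia.org/wiki/Genetic_algorithm). In the present invention, the approach is used to solve the optimization problem of selecting the "best" possible gene, i.e. coding sequence, for a protein of interest. In this approach, each position in the gene, i.e. each codon, can be regarded as one dimension whose set of values is discrete and determined by the available synonymous codons.

In general, a genetic algorithm starts by generating a set of possible "solutions" to the problem, usually at random or by varying solutions provided initially (although many other approaches exist). This set is called the "population".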
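The single-codon adaptation route (2) described earlier in this section (compute the desired number of each codon from the desired ratios, build a pool of synonymous codons per amino acid, then draw from the pool for each position) can be sketched as follows. This is only an illustrative sketch: the `DESIRED` table, the function name `adapt_single_codons` and the largest-remainder rounding are assumptions made here, not values or procedures taken from the text.

```python
import random
from collections import Counter

# Hypothetical desired codon ratios for two amino acids (illustration only;
# real values would come from the host's set of highly expressed genes).
DESIRED = {
    "F": {"TTT": 0.2, "TTC": 0.8},
    "W": {"TGG": 1.0},
}

def adapt_single_codons(protein, desired=DESIRED, seed=0):
    """Back-translate `protein` so codon counts match the desired ratios."""
    rng = random.Random(seed)
    pools = {}
    for aa, n in Counter(protein).items():
        # Desired (possibly fractional) number of each synonymous codon.
        raw = {codon: ratio * n for codon, ratio in desired[aa].items()}
        alloc = {codon: int(v + 1e-9) for codon, v in raw.items()}
        # Hand out the remaining slots to the largest fractional remainders
        # (an assumed rounding rule; the text does not specify one).
        short = n - sum(alloc.values())
        by_remainder = sorted(raw, key=lambda c: raw[c] - alloc[c], reverse=True)
        for codon in by_remainder[:short]:
            alloc[codon] += 1
        pool = [codon for codon, k in alloc.items() for _ in range(k)]
        rng.shuffle(pool)  # random assignment of codons to positions
        pools[aa] = pool
    # Draw one codon per position from the pool of the encoded amino acid.
    return [pools[aa].pop() for aa in protein]

gene = adapt_single_codons("FFFFFWFFFFF")
print(gene)
```

Running the sketch with different seeds yields multiple variants with the same codon counts, from which one could then pick a gene using additional criteria such as restriction sites.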
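Its elements are called "individuals" or "chromosomes" and are mostly represented by vectors (in the mathematical sense) containing a coordinate for each dimension. Because genetic algorithms are modelled on the processes involved in natural selection, much of the terminology is borrowed from genetics. Unlike in the present case, however, they are mainly applied in the field of computer science, although some examples of genetic algorithms applied to problems in the biological sciences have been presented, for instance for protein secondary structure prediction (Armano et al. 2005 BMC Bioinformatics 1(6) Suppl. 4:S3), in silico network optimization (Patil et al. 2005 BMC Bioinformatics 23(6):308) and clustering of gene expression data (Di Gesu et al. 2005 BMC Bioinformatics 7(6):289). In the present case the vectors contain codons.

New individuals are created from the population either by changing certain positions of an existing individual ("mutation") or by combining part of one individual (i.e. some of its coordinates) with the complementary part of another individual (i.e. its coordinates in the other dimensions) ("crossover"). It is then checked how good these individuals are (since the new individuals are also possible solutions to the original optimization problem), and the better ("fittest") individuals are again taken as the starting population for generating new individuals (the "next generation"). For example, the best 10%, 20%, 30%, 40%, 50% or 60% are kept, but many other ways exist of selecting a subset as the next generation so as to converge towards fitter individuals, such as roulette wheel selection (see Michalewicz, Z., 1994). When the best individuals from the original population are allowed to pass into the next generation, the quality of the possible solutions is guaranteed to improve, or at least remain the same, from population to population. It is then assumed that by running the algorithm for many generations (= iterations; hundreds to thousands, depending on the complexity of the problem) one will arrive at a solution close to the optimum. Genetic algorithms have been studied intensively in computer science, including properties such as the optimal ratio of population size to number of generations and how to prevent the algorithm from getting trapped in a local optimum, but this is of little concern here. For information on how these parameters were set for the actual optimization steps, see the description of the genetic algorithm implementation in MATLAB in Example 2.

This will be explained in detail with reference to Figure 2, which shows a flow chart of a genetic algorithm for gene optimization. Such a genetic algorithm can be run on a suitably programmed computer, an example of which is shown in Figure 1 and is explained first.

Figure 1 shows an overview of a computer arrangement that can be used to carry out the method of the invention. The arrangement comprises a processor 1 for carrying out arithmetic operations.

Note that genetic algorithms are generally non-deterministic, because they involve randomized steps (e.g. randomized selection criteria and/or randomized operator choice and/or randomized generation of potential solutions); exceptions that proceed deterministically do exist, however. "Genetic algorithm" is used here as a general term for algorithms that handle a set (called a population) of potential solutions, which is screened and/or selected and/or pruned using one or more objectives, and/or into which (newly) generated solutions that tend towards the optimal solution are (re)introduced. Given this definition, methods described as evolutionary programming, evolutionary algorithms, classical genetic algorithms, real-coded genetic algorithms, simulated annealing, ant algorithms, and Monte Carlo and chemotaxis methods belong to the same class of algorithms. This is in contrast to algorithms based on convergence of a single potential solution towards the optimal solution using a deterministic algorithm, such as linear programming and gradient algorithms. The skilled person will understand from the context whether another term denotes the same class of algorithms. Moreover, although a genetic algorithm is the preferred method, no other method for solving the single-codon and/or codon-pair optimization problem described in the invention is excluded.

Processor 1 is connected to a plurality of memory components, including a hard disk 5, read-only memory (ROM) 7, electrically erasable programmable read-only memory (EEPROM) 9 and random access memory (RAM) 11. Not all of these memory types need to be provided. Moreover, the memory components need not be located physically close to processor 1 but may be remote from it.

Processor 1 is also connected to means for the user to input instructions, data and the like, such as a keyboard 13 and a mouse 15. Other input means known to the skilled person, such as a touch screen, a trackball and/or a voice converter, may also be provided.

A reading unit 17 connected to processor 1 is provided. Reading unit 17 is arranged to read data from, and possibly write data to, a data carrier such as a floppy disk 19 or a CD-ROM 21. Other data carriers may be tapes, DVDs, memory sticks, etc., as known to the skilled person.

Processor 1 is also connected to a printer 23 for printing output data and to a display 3, e.g. a monitor or LCD (liquid crystal display) screen, or any other type of display known to the skilled person.

Processor 1 may be connected to a communication network 27, e.g. the public switched telephone network (PSTN), a local area network (LAN) or a wide area network (WAN), by means of an I/O device 25. Processor 1 may be arranged to communicate with other communication arrangements through network 27.

Data carriers 19, 21 may comprise a computer program product in the form of data and instructions arranged to give the processor the capacity to perform the method of the invention. Alternatively, such a computer program product may be downloaded via the telecommunication network 27.

Processor 1 may be implemented as a stand-alone system, as a plurality of parallel operating processors each arranged to carry out subtasks of a larger computer program, or as one or more main processors with several sub-processors. Parts of the functionality of the invention may even be carried out by remote processors communicating with processor 1 through network 27.

The algorithm of Figure 2 will now be explained. It can be carried out on processor 1 when processor 1 runs a computer program stored in its memory.

In action 32 the computer generates one or more genes encoding a predetermined protein. This may be done by retrieving such data from a table stored in the computer's memory. Such genes may, for example, be:
ATG, GTT, GCA, TGG, TGG, TCT, ...
ATG, GTA, GCA, TGG, TGG, TCA, ...
For the purposes of the algorithm these generated genes are called "original genes".

After action 32 the computer program performs one or more iteration loops by carrying out actions 34-40 one or more times.

In action 34 the computer program generates new genes by replacing one or more codons in the original genes with synonymous codons, so that the newly generated genes still encode the predetermined protein (crossover and mutation process). To be able to do so, the computer's memory stores a codon usage table that shows which codons encode which amino acid. (Note that deviations from the "universal code" exist and are taken into account where they apply to the particular host organism, see e.g. Laplaza et al., 2006, Enzyme and Microbial Technology, 38:741-747.) Knowing the amino acid sequence of the protein, the computer program can select alternative codons from the table, as is well known in the art. Continuing the example of action 32, the genes after this step may be:
ATG, GTT, GCA, TGG, TGG, TCT, ...
ATG, GTA, GCA, TGG, TGG, TCA, ...
ATG, GTT, GCA, TGG, TGG, TCA, ...
ATG, GTA, GCA, TGG, TGG, TCA, ...
ATG, GTA, GCC, TGG, TGG, TCA, ...

In action 36 the computer program determines a quality value for all genes, i.e. the original genes and the newly generated genes, using a fitness function that determines at least one of codon fitness and codon-pair fitness. Such fitness functions are explained in detail in the section on performing codon-pair optimization below.

In action 38 a number of genes showing the best fitness according to the fitness function are selected to take part in the "breeding process" (crossover and mutation), and a number of genes showing the worst fitness are selected for removal from the population. These numbers may be predetermined or may depend on a predetermined amount of fitness improvement. The selection of these genes can be deterministic, but it generally follows a stochastic process in which the "fittest genes" have a higher chance of being selected for breeding and, conversely, the least fit genes a higher chance of being removed from the population. This method is called roulette wheel selection. The resulting genes selected for breeding may, for example, be:
ATG, GTT, GCA, TGG, TGG, TCT, ...
ATG, GTT, GCA, TGG, TGG, TCA, ...
ATG, GTA, GCC, TGG, TGG, TCA, ...

In action 40 the computer program tests whether one or more termination criteria are fulfilled. Usually one of the termination criteria is a predetermined maximum number of iterations. Alternative criteria are to check whether the fitness obtained with the selected genes has improved by at least a minimum threshold relative to the fitness of the original genes, or relative to the fitness of the gene that had the best fitness n iterations earlier (n preferably being a value selected between 10 and 100). If the overall termination criteria are not fulfilled, the computer program jumps back to action 34 while treating the selected genes as "original genes".

If, in action 40, the computer program establishes that the improvement is below the minimum threshold, there is little point in repeating actions 34-38 and the program continues with action 42. Any other suitable iteration stop criterion (such as the number of iterations performed) can be used in action 40 to leave the iteration of actions 34-40 and continue with action 42.

In action 42 the gene with the best fitness among all selected genes is chosen and presented to the user, e.g. on a monitor or as a printout from a printer.

When a genetic algorithm is used for gene adaptation, it must be ensured that crossover always takes place at reading-frame positions; otherwise the resulting amino acid sequence might be changed by combining one nucleotide of one codon with two nucleotides of another codon. For better convergence a modified mutation operator is proposed, for which only those synonymous codon replacements are allowed that improve at least one of single-codon usage and codon-pair usage.

An important question for codon-pair optimization is therefore how to measure the quality of an individual. This so-called fitness function can be regarded as the central part of a genetic algorithm, because it is the actual function to be optimized. In the invention, one preferred approach is to assign a real number (called a weight) to each codon pair and to use the weights of the pairs in a gene as its "fitness", which results in a function to be minimized.

In this specification the inventors describe the process of gene optimization as a minimization problem. This is a rather arbitrary choice. If a function f is to be maximized, one can equally look for the minimum of -f, so this is no restriction of generality. A method must therefore be identified for determining codon-pair weights in which codon pairs considered beneficial for the expression level have a low weight and codon pairs considered detrimental have a high weight.

Identification of codon-pair weights for gene adaptation

To identify codon-pair weights that correlate with higher transcription/expression levels and can serve as input for the adaptation of codon-pair usage, the following method can be applied. It is exemplified herein by A. niger (for which the transcription levels of most expressed genes are known) and B. subtilis (for which transcription-level data and a set of 300 highly expressed genes are available).

In A. niger, for which a complete ranking extracted from GeneChip data is available for the aforementioned set of 4,584 actually expressed genes (see Example 1), the mean codon-pair weight of each gene (i.e. the equivalent of the fit_cp(g) value) was calculated. The genes were then sorted by fitness value (ascending) and by expression level (descending). Since highly expressed genes are assumed to have low codon-pair fitness values, these two rankings would be identical if ideal codon-pair weights were used. Comparing the two rankings therefore gives information on the quality of the weights used in the fitness function (with slightly more attention paid to the "correct" ranking of the highly expressed genes than to that of mediocre genes). In addition, the correlation coefficient (covariance divided by the standard deviation of each variable) between the ranking and the mean codon-pair weight of the 4,584 genes was calculated.

Several possible sets of weights can be used in the method of the invention, including one or more selected from:
(i) bias values from the whole genome;
(ii) bias values from a set of highly expressed genes;
(iii) bias values in which all values that do not reach a certain minimum z-score are set to zero (the z-score being determined as described in Example 1.1.4);
(iv) bias values raised to the power of 2, 3, 4, 5 or higher (to give highly preferred or rejected codon pairs a lower or higher influence);
(v) the z-scores themselves;
(vi) the difference between the bias values/z-scores from the highly expressed group and from the whole genome; and
(vii) a combination of one or more of (i)-(vi).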
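The loop of Figure 2 (actions 32 to 42) can be illustrated with a minimal genetic-algorithm sketch. Everything concrete in it is an assumption made for illustration: the reduced codon table `SYN`, the made-up weight function `toy_weight`, the truncation selection (keeping the best half rather than roulette wheel selection) and all parameter values. Genes are stored as lists of codons, so crossover can only occur at reading-frame positions.

```python
import random

# Hypothetical reduced codon table; a real run would use the host's full table.
SYN = {"M": ["ATG"], "V": ["GTT", "GTA", "GTC", "GTG"],
       "A": ["GCA", "GCC", "GCG", "GCT"], "W": ["TGG"],
       "S": ["TCT", "TCA", "TCC", "TCG"]}
AA_OF = {codon: aa for aa, codons in SYN.items() for codon in codons}

def toy_weight(c1, c2):
    """Made-up codon-pair weight in [-1, 1]; lower is better (assumption)."""
    return ((sum(map(ord, c1 + c2)) * 37) % 201 - 100) / 100.0

def fitness(gene):
    """Mean weight over all adjacent codon pairs (to be minimised)."""
    return sum(toy_weight(a, b) for a, b in zip(gene, gene[1:])) / (len(gene) - 1)

def mutate(gene, rng):
    """Replace one randomly chosen codon with a synonymous codon."""
    g = list(gene)
    k = rng.randrange(len(g))
    g[k] = rng.choice(SYN[AA_OF[g[k]]])
    return g

def crossover(g1, g2, rng):
    """Join two genes at a codon boundary, so the protein is unchanged."""
    cut = rng.randrange(1, len(g1))
    return g1[:cut] + g2[cut:]

def optimise(protein, pop_size=20, keep=0.5, max_iter=200, patience=30, seed=1):
    rng = random.Random(seed)
    # Action 32: generate original genes encoding the protein.
    pop = [[rng.choice(SYN[aa]) for aa in protein] for _ in range(pop_size)]
    best, stale = min(map(fitness, pop)), 0
    for _ in range(max_iter):                      # stop criterion: max iterations
        # Action 34: new genes by crossover and mutation.
        children = [mutate(crossover(rng.choice(pop), rng.choice(pop), rng), rng)
                    for _ in range(pop_size)]
        # Actions 36 and 38: score all genes, keep the fittest (elitist).
        ranked = sorted(pop + children, key=fitness)
        pop = ranked[:max(2, int(keep * pop_size))]
        # Action 40: stop when no improvement for `patience` iterations.
        current = fitness(pop[0])
        if current < best - 1e-12:
            best, stale = current, 0
        else:
            stale += 1
        if stale >= patience:
            break
    return pop[0]                                  # action 42: best gene

best_gene = optimise("MVAWWS")
print(best_gene, round(fitness(best_gene), 3))
```

Because the best genes of each generation are carried over, the best fitness in the population can never get worse from one iteration to the next, which matches the guarantee described for elitist selection above.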
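For the genetic algorithm, the negation of these values has been used, because preferred codon pairs were arbitrarily identified by positive values whereas the genetic algorithm performs a minimization. This applies to all the weights mentioned above.

A more preferred weight matrix can be obtained, as described above, by calculating the codon-pair "bias" in the highly expressed group using expected values computed from the codon ratios of the whole genome. With $r_{sc}^{all}(c_k)$ again denoting the single-codon ratio of codon $c_k$ in the whole-genome data set and $n_{obs}^{high}((c_i,c_j))$ the number of occurrences of the pair $(c_i,c_j)$ in the highly expressed group, the "combined expected value" $n_{exp}^{comb}((c_i,c_j))$ is calculated as

$$n_{exp}^{comb}((c_i,c_j)) = r_{sc}^{all}(c_i)\, r_{sc}^{all}(c_j) \sum_{c_k \in syn(c_i),\; c_l \in syn(c_j)} n_{obs}^{high}((c_k,c_l))$$

where the sum runs over all codons $c_k$ synonymous with $c_i$ and all codons $c_l$ synonymous with $c_j$, and thus

$$w((c_i,c_j)) = \frac{n_{exp}^{comb}((c_i,c_j)) - n_{obs}^{high}((c_i,c_j))}{\max\left(n_{exp}^{comb}((c_i,c_j)),\; n_{obs}^{high}((c_i,c_j))\right)}$$

where $w((c_i,c_j))$ is defined as the weight of the codon pair $(c_i,c_j)$ in a codon sequence g. Note that, because the optimization function looks for a minimum mean weight, the two terms of the numerator are reversed compared with the equation for the bias values; apart from changing the sign, this does not affect the correlation with expression levels.

Unlike the other weight sets tested, these weights put codon pairs at a slight disadvantage when they involve codons that are more underrepresented in the highly expressed group. They are therefore the only weights that also reflect the difference in single-codon bias between the highly expressed group and all genes. Using these weights carries the risk of rejecting some codon pairs that actually have a positive bias in the highly expressed group but consist of codons that are rarely used (in the highly expressed group). However, since the desired single-codon ratios are usually not identical to those in the group of highly expressed genes but more "extreme", single-codon optimization would replace these underrepresented codons anyway, so the weights described above can be considered very convenient for codon-pair optimization. Thus, although the codon-pair weights also reflect single-codon bias to a limited extent, single-codon usage is treated as a separate, additional issue for optimization.

Optimization of single codons and codon pairs using a genetic algorithm

In the method of the invention, codon-pair adaptation, or combined single-codon and codon-pair adaptation, is preferably performed with a computer arrangement programmed to carry out a genetic algorithm as described above. Applying a genetic algorithm to single-codon adaptation is also possible and is not excluded from the invention, but there an unwanted codon can be replaced by a synonymous codon without constraints from neighbouring codons, so a genetic algorithm is not really necessary.

For codon pairs, changing a single codon usually changes the weights of two codon pairs. Codon-pair optimization is therefore heavily constrained: a single codon change that replaces an unwanted codon pair always changes another codon pair as well, not necessarily for the better, and correcting a change for the worse in the neighbouring codon pair will in turn change yet another pair, and so on.

For the mutation operator, only those alterations of the codon sequence are allowed that leave the encoded peptide sequence unchanged and improve at least one of single-codon fitness and codon-pair fitness. That is, before making a change, the mutation operator looks for a synonymous codon that is underrepresented (according to the desired single-codon ratios) or one for which the two codon pairs involved have better weights. Which of the two types of mutation is performed is chosen at random. Applying the former "mutation" operator to every single codon is sufficient to create a single-codon-optimized gene without any genetic algorithm.

The quality of a gene is determined with regard to two aspects, namely single-codon "fitness" and codon-pair "fitness". The latter is simply the mean of the weights $w((c(k), c(k+1)))$ of all codon pairs in the codon sequence g (or gene). That is, if g again denotes a codon sequence, |g| its length (in codons) and c(k) its k-th codon, then

$$fit_{cp}(g) = \frac{1}{|g|-1} \sum_{k=1}^{|g|-1} w((c(k), c(k+1)))$$

Single-codon fitness is defined as the difference between the actual codon ratios in the gene and the target codon ratios, normalized for the number of occurrences of each codon. Single-codon ratios are defined and determined as described in Example 1.1.2 herein. With $r_{sc}^{target}(c(k))$ the desired ratio (or frequency) of codon c(k) and $r_{sc}^{g}(c(k))$, as before, the actual ratio in gene g, single-codon fitness is defined as

$$fit_{sc}(g) = \frac{1}{|g|} \sum_{k=1}^{|g|} \left| r_{sc}^{target}(c(k)) - r_{sc}^{g}(c(k)) \right|$$

Thus $fit_{sc}$ can take values in [0,1], with optimal sequences close to 0, while $fit_{cp}$ is bounded by the weights, which here lie in [-1,1].

To optimize for both aspects, one embodiment introduces a combined fitness function

$$fit_{combi}(g) = \frac{fit_{cp}(g)}{cpi + fit_{sc}(g)}$$

where cpi, which stands for "codon pair importance", is a real number greater than zero that determines which of the two fitness functions has more influence on the combined fitness. With cpi close to zero, the denominator approaches zero as $fit_{sc}(g)$ improves (i.e. also approaches zero), so small changes in $fit_{sc}(g)$ affect $fit_{combi}(g)$ more than small changes in $fit_{cp}(g)$. With high cpi, on the other hand, a slight improvement in $fit_{cp}(g)$ may have a larger effect on $fit_{combi}(g)$ than a moderate improvement in $fit_{sc}(g)$. Note that $fit_{combi}$ values obtained with different cpi values are not comparable (cpi close to 0 can lead to $fit_{combi}$ close to -100, whereas for cpi > 0.2, $fit_{combi}$ generally lies between 0 and -1).

In one embodiment, a "penalty" is added if g contains certain unwanted sequences (e.g. restriction sites or sequences leading to unwanted secondary structures in the mRNA). This can be useful when constructing synthetic genes, but is in itself unrelated to the optimization of single-codon and codon-pair usage. The modified fitness function becomes

$$fit_{combi}^{*}(g) = \frac{fit_{cp}(g)}{cpi + fit_{sc}(g)} + P(g)$$

where P(g) denotes a penalty function that creates a positive weight whenever an unwanted sequence structure is part of gene g.

In embodiments of the invention, the nucleotide and amino acid sequences may be theoretical sequences that exist only, for example, on paper or on another, preferably computer-readable, data carrier, or they may exist as tangible, physically created embodiments.

A first aspect of the invention therefore relates to a method for optimizing a nucleotide sequence encoding a predetermined amino acid sequence, wherein the coding sequence is optimized for expression in a predetermined host cell. The method preferably comprises the steps of:
(a) generating at least one original coding sequence that encodes the predetermined amino acid sequence;
(b) generating at least one newly generated coding sequence from the at least one original coding sequence by replacing one or more codons in the at least one original coding sequence with synonymous codons;
(c) determining a fitness value of the at least one original coding sequence and a fitness value of the at least one newly generated coding sequence, using a fitness function that determines at least one of single-codon fitness and codon-pair fitness for the predetermined host cell;
(d) choosing one or more selected coding sequences among the at least one original coding sequence and the at least one newly generated coding sequence in accordance with a predetermined selection criterion, such that the higher the fitness value, the higher the chance of being chosen; and
(e) repeating actions (b) to (d), while treating the one or more selected coding sequences as one or more original coding sequences in actions (b) to (d), until a predetermined iteration stop criterion is fulfilled.

According to one embodiment of the invention, the method preferably comprises steps (a), (b), (d) and (e) as above, with step (c) using a fitness function that determines codon-pair fitness for the predetermined host cell.

According to another embodiment of the invention, the method preferably comprises steps (a), (b), (d) and (e) as above, with step (c) using a fitness function that comprises determining both single-codon fitness and codon-pair fitness for the predetermined host cell.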
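The three fitness measures described in this section (single-codon fitness, codon-pair fitness and the combined fitness with the codon pair importance cpi) can be sketched in a few lines. The sketch assumes a gene given as a list of codons; the function and argument names, and the tiny example tables, are hypothetical and serve only to make the arithmetic concrete.

```python
from collections import Counter

def fit_sc(gene, target, aa_of):
    """Mean absolute difference between target and actual codon ratios."""
    codon_n = Counter(gene)
    aa_n = Counter(aa_of[c] for c in gene)
    # Actual ratio of each codon among the codons for the same amino acid.
    actual = {c: codon_n[c] / aa_n[aa_of[c]] for c in codon_n}
    return sum(abs(target[c] - actual[c]) for c in gene) / len(gene)

def fit_cp(gene, w):
    """Mean weight of all adjacent codon pairs in the gene."""
    return sum(w[(a, b)] for a, b in zip(gene, gene[1:])) / (len(gene) - 1)

def fit_combi(gene, target, aa_of, w, cpi=0.2):
    """Combined fitness: codon-pair fitness divided by (cpi + single-codon fitness)."""
    return fit_cp(gene, w) / (cpi + fit_sc(gene, target, aa_of))

# Made-up example data (not taken from the appendices of the source).
aa_of = {"ATG": "M", "TTT": "F", "TTC": "F", "TGG": "W"}
target = {"ATG": 1.0, "TTT": 0.0, "TTC": 1.0, "TGG": 1.0}
w = {("ATG", "TTT"): -0.2, ("TTT", "TTC"): 0.4,
     ("TTC", "TTC"): -0.6, ("TTC", "TGG"): -0.4}
gene = ["ATG", "TTT", "TTC", "TTC", "TGG"]

print(fit_sc(gene, target, aa_of), fit_cp(gene, w),
      fit_combi(gene, target, aa_of, w))
```

For this toy gene the actual phenylalanine ratios are 1/3 (TTT) and 2/3 (TTC), so each of the three phenylalanine positions contributes 1/3 and the single-codon fitness is 1/5 = 0.2. The four pair weights sum to -0.8, giving a codon-pair fitness of -0.2, and with cpi = 0.2 the combined fitness is -0.2 / 0.4 = -0.5 (up to floating-point rounding). A penalty term P(g) for unwanted sequences could simply be added to the returned value.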
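In the method, the predetermined selection criterion is preferably such that the one or more selected coding sequences have the best fitness value according to a predetermined criterion. The method of the invention may further comprise, after action (e), selecting a best individual coding sequence among the one or more selected coding sequences, the best individual coding sequence having a better fitness value than the other selected coding sequences.

In the method of the invention, the predetermined iteration stop criterion is preferably at least one of:
(a) testing whether at least one of the selected coding sequences has a best fitness value above a predetermined threshold;
(b) testing whether none of the selected coding sequences has a best fitness value below the predetermined threshold;
(c) testing whether, in at least one of the selected coding sequences, at least 30% of the codon pairs that have an associated positive codon-pair weight for the predetermined host cell in the original coding sequence have been converted into codon pairs with an associated negative weight; and
(d) testing whether, in at least one of the selected coding sequences, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90% of the codon pairs that have an associated positive weight above 0 for the predetermined host cell in the original coding sequence have been converted into codon pairs with an associated weight below 0.

In the method of the invention, the fitness function preferably defines single-codon fitness by means of

$$fit_{sc}(g) = \frac{1}{|g|} \sum_{k=1}^{|g|} \left| r_{sc}^{target}(c(k)) - r_{sc}^{g}(c(k)) \right|$$

where g denotes a coding sequence, |g| its length, c(k) its k-th codon, $r_{sc}^{target}(c(k))$ the desired ratio of codon c(k) (Appendix 2; CR vector) and $r_{sc}^{g}(c(k))$ the actual ratio in the nucleotide coding sequence g.

In the method of the invention, the fitness function preferably defines codon-pair fitness by means of

$$fit_{cp}(g) = \frac{1}{|g|-1} \sum_{k=1}^{|g|-1} w((c(k), c(k+1)))$$

where $w((c(k), c(k+1)))$ is the weight of a codon pair in the coding sequence g, |g| is the length of the nucleotide coding sequence and c(k) is the k-th codon in the coding sequence g.

More preferably, in the method of the invention, the fitness function is defined by means of

$$fit_{combi}(g) = \frac{fit_{cp}(g)}{cpi + fit_{sc}(g)}$$

with

$$fit_{cp}(g) = \frac{1}{|g|-1} \sum_{k=1}^{|g|-1} w((c(k), c(k+1))), \qquad fit_{sc}(g) = \frac{1}{|g|} \sum_{k=1}^{|g|} \left| r_{sc}^{target}(c(k)) - r_{sc}^{g}(c(k)) \right|$$

where cpi is a real value greater than zero, $fit_{cp}(g)$ is the codon-pair fitness function, $fit_{sc}(g)$ is the single-codon fitness function, $w((c(k), c(k+1)))$ is the weight of a codon pair in the coding sequence g (Appendix 3, CPW matrix), |g| is the length of the coding sequence, c(k) is the k-th codon in the codon sequence, $r_{sc}^{target}(c(k))$ is the desired ratio of codon c(k) and $r_{sc}^{g}(c(k))$ is the actual ratio in the coding sequence g. Preferably cpi is between 0 and 10, more preferably between 0 and 0.5, and most preferably about 0.2.

In the method of the invention, the codon-pair weights w (Appendix 3) may be taken from a 61x64 codon-pair matrix that includes the stop codons. Note that the weights of stop:sense pairs and stop:stop pairs are always zero. The codon-pair weights w are preferably calculated by a computer-based method using at least one of the following as input:
(a) a genome sequence of the predetermined host cell in which at least 5%, 10%, 20% or 80% of the protein-encoding nucleotide sequences have been sequenced;
(b) a genome sequence of a species related to the predetermined host cell in which at least 5%, 10%, 20% or 80% of the protein-encoding nucleotide sequences have been sequenced;
(c) a set of nucleotide sequences consisting of at least 200 coding sequences of the predetermined host cell; and
(d) a set of nucleotide sequences consisting of at least 200 coding sequences of a species related to the predetermined host cell.
A related species is herein understood to mean a species whose small subunit ribosomal RNA nucleotide sequence has at least 60%, 70%, 80% or 90% identity with the small subunit ribosomal RNA nucleotide sequence of the predetermined host cell (Wuyts et al., 2004, Nucleic Acids Res. 32:D101-D103).

The codon-pair weights w need not be determined for all possible 61x64 codon pairs (which include the termination signal as stop codon); they may be determined for only a fraction of them, e.g. for at least 5%, 10%, 20%, 50% and preferably 100% of the possible 61x64 codon pairs that include the termination signal as stop codon.

Selection of highly expressed genes

To calculate the codon-pair weight matrix and the vector of single-codon target ratios, one can use a set of nucleotide sequences from the particular host cell itself, a set of nucleotide sequences from a related species, or a combination of both. This set A of nucleotide sequences is called the "reference set all". Most preferably, the set contains the complete set of open reading frames (ORFs) of a fully sequenced (>95%) organism.

In a preferred embodiment of the invention, a subset B is selected in which highly expressed genes, or genes encoding highly expressed proteins, are overrepresented. This set can be determined by measurement followed by ranking, for example mRNA hybridization using array technology (e.g. arrays from Affymetrix, Nimblegen, Agilent or any other source) for reference set A. Other measurement methods may be RT-PCR, protein gels, MS-MS analysis or any other measurement technique known to the skilled person. Besides ranking on the basis of measurements, bioinformatics tools can also be applied to predict a set of highly expressed genes directly, for example by selecting the most biased genes (Carbone et al., 2003) or by selecting genes known to be highly expressed in a wide range of organisms. These include ribosomal proteins, the glycolysis and TCA cycle genes involved in primary metabolism, and genes involved in transcription and translation.

Preferably, the codon-pair weights w are calculated by a computer-based method using as input a set of genes that are highly expressed in the predetermined host cell. A highly expressed gene is herein understood to mean a gene whose mRNA can be detected at a level of at least 10, preferably 20, more preferably 50, more preferably 100, more preferably 500 and most preferably at least 1,000 copies per cell. For example, Gygi et al. measured about 15,000 mRNA molecules per yeast cell. The abundance of specific mRNA molecules was determined to be in the range of 0.1-470 per cell (Gygi, S.P., Y. Rochon, B.R. Franza and R. Aebersold (1999). Correlation between protein and mRNA abundance in yeast. Mol. Cel. Biol. 19(3):1720-30) or, a factor of 10 lower, in the range of 0.01-50 per cell (Akashi, H. (2003). Translational selection and yeast proteome evolution. Genetics 164(4):1291-1303).

Alternatively, the set of highly expressed genes in the predetermined host cell can be the set comprising the 1000, 500, 400, 300, 200 or 100 most abundant mRNAs or proteins. The skilled person will appreciate that for calculating the single-codon ratios the set of highly expressed genes can be small, because at most only 64 target values are determined. In that case the reference set of highly expressed genes can be as small as one gene, although usually 1% of the genome size is considered a representative set of highly expressed genes, see for example Carbone, A. et al. (2003) (Codon adaptation index as a measure of dominating codon bias. Bioinformatics 19(16):2005-15). For calculating the codon-pair weight matrix, a set of 200-500 reference genes usually suffices, which corresponds to 2-7% of a bacterial genome (3000-15000 genes).

Another possibility is to derive a subset of putatively highly expressed genes from the literature. For the model organism B. subtilis, for example, there is a large body of literature on single-codon bias. A good overview of the state of the art for Bacillus subtilis is given by the work of Kanaya et al. (1999). In the approach taken here (see Example 4), the data were classified into a subset of highly expressed genes on the basis of mRNA levels measured with Affymetrix technology, and these sequences were compared with the complete set of genomic ORFs. Other options that have been used in the literature are protein expression data and groups of genes from functional categories (expected to be highly expressed), such as ribosomal proteins and proteins involved in translation and transcription, sporulation, energy metabolism and the flagellar system (Kanaya et al., 1999; Karlin and Mrazek, 2000). Indeed, high codon bias is often found in groups such as the ribosomal proteins and the other groups named.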
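The weight calculation from a reference set A ("all") and a highly expressed subset B can be sketched as below. The sketch assumes one particular reading of the "combined expected value": whole-genome single-codon ratios multiplied by the number of times the encoded amino acid pair occurs in the highly expressed set. That reading, the helper names `codon_ratios` and `pair_weights`, and the toy data are assumptions for illustration, not the source's definitive procedure.

```python
from collections import Counter

def codon_ratios(genes, aa_of):
    """Ratio of each codon among the synonymous codons for its amino acid."""
    codon_n = Counter(c for g in genes for c in g)
    aa_n = Counter()
    for codon, n in codon_n.items():
        aa_n[aa_of[codon]] += n
    return {codon: n / aa_n[aa_of[codon]] for codon, n in codon_n.items()}

def pair_weights(all_genes, high_genes, aa_of):
    """Codon-pair weights in [-1, 1]; negative means overrepresented in set B."""
    r_all = codon_ratios(all_genes, aa_of)                 # from reference set A
    n_obs = Counter(p for g in high_genes for p in zip(g, g[1:]))  # from subset B
    aa_pair_n = Counter()                                  # amino acid pair counts in B
    for (a, b), n in n_obs.items():
        aa_pair_n[(aa_of[a], aa_of[b])] += n
    w = {}
    for a in r_all:
        for b in r_all:
            # Assumed form of the combined expected value (see lead-in).
            n_exp = r_all[a] * r_all[b] * aa_pair_n[(aa_of[a], aa_of[b])]
            obs = n_obs[(a, b)]
            denom = max(n_exp, obs)
            # Expected minus observed: the sign is flipped relative to the bias,
            # because the optimisation looks for a minimum.
            w[(a, b)] = (n_exp - obs) / denom if denom > 0 else 0.0
    return w

# Toy data: genes as lists of codons (hypothetical, for illustration only).
aa_of = {"TTT": "F", "TTC": "F", "TGG": "W"}
all_genes = [["TTT", "TTC", "TGG"], ["TTC", "TTT", "TGG"]]
high_genes = [["TTC", "TGG"], ["TTC", "TGG"], ["TTC", "TGG"], ["TTT", "TGG"]]
weights = pair_weights(all_genes, high_genes, aa_of)
print(weights[("TTC", "TGG")], weights[("TTT", "TGG")])
```

In the toy data each phenylalanine codon has a whole-genome ratio of 0.5 and the amino acid pair F-W occurs four times in the highly expressed set, so two occurrences are expected for each codon pair. TTC-TGG is observed three times, giving (2 - 3) / 3, and TTT-TGG once, giving (2 - 1) / 2, so the preferred pair receives the lower weight.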
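However, generally not all genes in the latter groups show such behaviour. Nor is it known how the ribosomal proteins respond under low-growth production conditions. It therefore seems reasonable to derive the subset of highly expressed genes from direct measurement techniques. One can then choose transcriptome (TX) and/or proteome (PX) data. There are pros and cons to both. TX gives a fairly complete picture of the mRNA levels of the genes across the whole genome, whereas PX data may be biased by overrepresentation of water-soluble proteins. TX data are a direct measure of the mRNA available for translation, whereas protein is part of an accumulation process in which turnover also plays an important role. Altogether, TX and PX data have been shown to correlate for highly expressed genes (Gygi et al., 1999).

Another interesting piece of work predicts highly expressed (PHX) genes from their deviation from average codon usage and their similarity to ribosomal proteins, translation and transcription processing factors, and chaperone-degradation proteins (Karlin and Mrazek, 2000). Particularly for fast-growing organisms such as Bacillus, E. coli, etc., the major glycolysis genes and tricarboxylic acid cycle genes were found to belong to this group. The prediction method compares well with genes known to be highly expressed on the basis of mRNA data and protein expression.

The skilled person will appreciate that both the single-codon weights and the codon-pair weights w can be determined for modified host cells, i.e. host cells that have been modified with respect to the content and nature of their tRNA-encoding genes. These include host cells comprising extra copies of existing tRNA genes, host cells comprising new (exogenous) tRNA genes, including non-natural tRNA genes such as genes encoding tRNAs modified to incorporate non-natural amino acids or other compounds, and host cells in which one or more tRNA genes have been inactivated or deleted.

In the method of the invention, the original coding nucleotide sequence encoding the predetermined amino acid sequence is selected from:
(a) a wild-type nucleotide sequence encoding the predetermined amino acid sequence;
(b) a back-translation of the predetermined amino acid sequence, in which the codon for each amino acid position in the predetermined amino acid sequence is chosen at random from the synonymous codons encoding that amino acid; and
(c) a back-translation of the predetermined amino acid sequence, in which the codon for each amino acid position in the predetermined amino acid sequence is chosen in accordance with the single-codon bias of the predetermined host cell or of a species related to the host cell.

Host cells

In the method of the invention, the predetermined host cell can be any host cell or organism suitable for producing a polypeptide of interest by expressing the optimized nucleotide coding sequence. The host cell can thus be a prokaryotic or a eukaryotic host cell. The host cell may be a host cell suitable for culture in liquid or on solid media. Alternatively, the host cell may be a cell that is part of a multicellular tissue or of a multicellular organism such as a (transgenic) plant, animal or human.

The host cell can be microbial or non-microbial. Suitable non-microbial host cells include, for example, mammalian host cells such as hamster cells: CHO (Chinese hamster ovary) and BHK (baby hamster kidney) cells; mouse cells (e.g. NS0); monkey cells such as COS or Vero; human cells such as PER.C6™ or HEK-293 cells; insect cells such as Drosophila S2 and Spodoptera Sf9 or Sf21 cells; and plant cells such as tobacco, tomato, potato, oilseed rape, cabbage, pea, wheat, corn, rice, Taxus species such as Taxus brevifolia, Arabidopsis species such as Arabidopsis thaliana, and Nicotiana species such as Nicotiana tabacum. Such non-microbial cells are particularly suitable for producing mammalian or human proteins for use in mammalian or human therapy.

The host cell can also be a microbial host cell such as a bacterial or fungal cell. Suitable bacterial host cells include both Gram-positive and Gram-negative bacteria. Suitable bacterial host cells include bacteria from the genera Bacillus, Actinomycetes, Escherichia and Streptomyces, as well as lactic acid bacteria such as Lactobacillus, Streptococcus, Lactococcus, Oenococcus, Leuconostoc, Pediococcus, Carnobacterium, Propionibacterium, Enterococcus and Bifidobacterium. Particularly preferred are Bacillus subtilis, Streptomyces coelicolor, Streptomyces clavuligerus and Lactobacillus plantarum.

Alternatively, the host cell can be a eukaryotic microorganism such as a yeast or a filamentous fungus. Yeasts preferred as host cells belong to genera such as Saccharomyces, Kluyveromyces, Candida and Pichia. Particularly preferred yeast host cells include Saccharomyces cerevisiae and [...].

According to a more preferred embodiment, the host cell of the invention is a cell of a filamentous fungus. "Filamentous fungi" include all filamentous forms of the subdivision Eumycota and Oomycota (as defined by Hawksworth et al., 1995, supra). Filamentous fungi are characterized by a mycelial wall composed of chitin, cellulose, glucan, chitosan, mannan and other complex polysaccharides. Vegetative growth is by hyphal elongation, and carbon catabolism is obligately aerobic. Filamentous fungal genera whose strains can be used as host cells in the invention include, but are not limited to, strains of the genera Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Piromyces, Schizophyllum, Chrysosporium, Talaromyces, Thermoascus, Thielavia, Tolypocladium and Trichoderma. Preferably the filamentous fungus belongs to a species selected from the group consisting of Aspergillus niger, Aspergillus sojae, Aspergillus oryzae, Trichoderma reesei and Penicillium chrysogenum. Examples of suitable host strains include Aspergillus niger CBS 513.88 (Pel et al., 2007, Nat Biotech. 25:221-231), Aspergillus oryzae ATCC 20423, IFO 4177, ATCC 1011, ATCC 9576, ATCC 14488-14491, ATCC 11601, ATCC 12892, P. chrysogenum CBS 455.95, Penicillium citrinum ATCC 38065, Penicillium chrysogenum P2, Acremonium chrysogenum ATCC 36225 or ATCC 48272, Trichoderma reesei ATCC 26921 or ATCC 56765 or ATCC 26921, Aspergillus sojae ATCC 11906, Chrysosporium lucknowense ATCC 44006 and derivatives thereof.

The host cell can be a wild-type filamentous fungal host cell or a variant, mutant or genetically modified filamentous fungal host cell. Such modified filamentous fungal host cells include, for example:
- host cells with reduced protease levels, e.g. protease-deficient strains such as Aspergillus oryzae JaL 125 (described in WO 97/35956 or EP 429 490), the tripeptidyl-aminopeptidase-deficient A. niger strain disclosed in WO 96/14404, or host cells with reduced production of the protease transcriptional activator (prtT; as described in WO 01/68864, US 2004/0191864 A1 and WO 2006/040312);
- host strains such as Aspergillus oryzae BECh2, in which three TAKA amylase genes, two protease genes and the ability to form the metabolites cyclopiazonic acid and kojic acid have been inactivated (BECh2 is disclosed in WO 00/39322);
- filamentous fungal host cells comprising an elevated unfolded protein response (UPR) compared with the wild-type cell to enhance the production capacity for the polypeptide of interest (described in US 2004/0186070 A1, US 2001/0034045 A1, WO 01/72783 A2 and WO 2005/123763);
- host cells with an oxalate-deficient phenotype (described in WO 2004/070022 A2 and WO 2000/50576);
- host cells with reduced expression of abundant endogenous polypeptides such as glucoamylase, neutral alpha-amylase A, neutral alpha-amylase B, alpha-1,6-transglucosidase, proteases, cellobiohydrolase and/or oxalic acid hydrolase (as can be obtained by genetic modification according to the techniques described in US 2004/0191864 A1);
- host cells with an increased efficiency of homologous recombination (having a deficient hdfA or hdfB gene as described in WO 2005/095624); and
- host cells with any possible combination of these modifications.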
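In a method of the invention, the predetermined amino acid sequence can be an amino acid sequence (of a polypeptide of interest) that is heterologous to the predetermined host cell, or it can be an amino acid sequence (of a polypeptide of interest) that is homologous to the predetermined host cell.

The term "heterologous", when used with respect to a nucleic acid (DNA or RNA) or protein, refers to a nucleic acid or protein that does not occur naturally as part of the organism, cell, genome or DNA or RNA sequence in which it is present, or that is found in a cell, or at a location or site in the genome or DNA or RNA sequence, that differs from that in which it is found in nature. Heterologous nucleic acids or proteins are not endogenous to the cell into which they are introduced, but have been obtained from another cell or have been produced synthetically or recombinantly. Generally, though not necessarily, such nucleic acids encode proteins that are not normally produced by the cell in which the nucleic acid is expressed. The term heterologous nucleic acid or protein herein encompasses any nucleic acid or protein that the skilled person would recognize as heterologous or foreign to the cell in which it is expressed. The term heterologous also applies to non-natural combinations of nucleic acid or amino acid sequences, i.e. combinations in which at least two of the combined sequences are foreign with respect to each other.

The term "homologous", when used to indicate the relation between a given (recombinant) nucleic acid or polypeptide molecule and a given host organism or host cell, is understood to mean that in nature the nucleic acid or polypeptide molecule is produced by a host cell or organism of the same species, preferably of the same variety or strain.

The predetermined amino acid sequence can be the sequence of any polypeptide of interest with commercial or industrial applicability or utility. Thus, the polypeptide of interest can be an antibody or a portion thereof, an antigen, a clotting factor, an enzyme, a hormone or hormone variant, a receptor or a portion thereof, a regulatory protein, a structural protein, a reporter or a transport protein, an intracellular protein, a protein involved in the secretion process, a protein involved in the folding process, a chaperone, a peptide amino acid transporter, a glycosylation factor or a transcription factor. Preferably the polypeptide of interest is secreted into the extracellular environment of the host cell by the classical secretion pathway, by a non-classical secretion pathway or by an alternative secretion pathway (described in WO 2006/040340).

If the polypeptide of interest is an enzyme, it can be, for example, an oxidoreductase, transferase, hydrolase, lyase, isomerase, ligase, catalase, cellulase, chitinase, cutinase, deoxyribonuclease, dextranase or esterase. More preferred enzymes include, for example, carbohydrases, e.g. cellulases such as endoglucanases, beta-glucanases, cellobiohydrolases or beta-glucosidases, hemicellulases or pectinolytic enzymes such as xylanases, xylosidases, mannanases, galactanases, galactosidases, pectin methyl esterases, pectin lyases, pectate lyases, endopolygalacturonases, exopolygalacturonases, rhamnogalacturonases, arabanases, arabinofuranosidases, arabinoxylan hydrolases, galacturonases, lyases or amylolytic enzymes; hydrolases, isomerases or ligases; phosphatases such as phytases; esterases such as lipases; proteolytic enzymes; oxidoreductases such as oxidases; transferases or isomerases; phytases, aminopeptidases, carboxypeptidases, endoproteases, metalloproteases, serine proteases, catalases, chitinases, cutinases, cyclodextrin glycosyltransferases, deoxyribonucleases, alpha-galactosidases, beta-galactosidases, glucoamylases, alpha-glucosidases, beta-glucosidases, haloperoxidases, invertases, laccases, mannosidases, mutanases, peroxidases, phospholipases, polyphenol oxidases, ribonucleases, transglutaminases, glucose oxidases, hexose oxidases and monooxygenases.

Several therapeutic proteins of interest include, for example, antibodies and fragments thereof, human insulin and analogues thereof, human lactoferrin and analogues thereof, human growth hormone, erythropoietin, tissue plasminogen activator (tPA) and insulinotropin.

The polypeptide can be involved in the synthesis of a metabolite, preferably citric acid. Such polypeptides include, for example, aconitate hydratase, aconitase, 6-phosphofructokinase, citrate synthase, carboxyphosphonoenolpyruvate phosphonomutase, glycolate reductase, glucose oxidase precursor goxC, nucleoside diphosphate sugar epimerase, glucose oxidase, manganese superoxide dismutase, citrate lyase, ubiquinone reductase, carrier proteins, citrate transport proteins, mitochondrial respiratory proteins and metal transport proteins.

Computer, program and data carrier

A further aspect of the invention relates to a computer comprising a processor and memory, the processor being arranged to read from and write into the memory, the memory comprising data and instructions arranged to give the processor the capacity to perform the method of the invention.

Another aspect of the invention relates to a computer program product comprising data and instructions and arranged to be loaded into the memory of a computer that also comprises a processor, the processor being arranged to read from and write into the memory, the data and instructions being arranged to give the processor the capacity to perform the method of the invention.

In yet another aspect, the invention relates to a data carrier provided with a computer program product as defined above.

Nucleic acid molecules

A further aspect of the invention relates to a nucleic acid molecule comprising a coding sequence that encodes a predetermined amino acid sequence. The coding sequence is preferably a nucleotide sequence that does not resemble a naturally occurring coding sequence. The coding sequence in the nucleic acid molecule is not a naturally occurring nucleotide sequence but an artificial, i.e. engineered, man-made nucleotide sequence, which is generated on the basis of a method for optimizing single-codon and/or codon-pair bias for a predetermined host cell as defined herein, and is subsequently synthesized as a tangible nucleic acid molecule. Preferably, the coding sequence has a fit_sc(g) for the predetermined host cell of at least below 0.2, more preferably below 0.1 and most preferably below 0.02. More preferably, the coding sequence has a fit_cp(g) for the predetermined host cell of at least below 0. Most preferably, the coding sequence has a fit_cp(g) for the predetermined host cell of at least below -0.1, or more preferably at least below -0.2. Preferably, at least 60%, 70%, 75%, 80% or 85%, and most preferably at least 90%, of the codon pairs in the optimized gene g have an associated negative codon-pair weight for the particular host organism.

The predetermined amino acid sequence encoded by the coding sequence can be any polypeptide of interest as defined above, and the predetermined host cell can likewise be any host cell as defined above.

In the nucleic acid molecule, the coding sequence is preferably operably linked to expression control sequences that are capable of directing expression of the coding sequence in the predetermined host cell. In the context of the invention, control sequences are defined as nucleotide sequences that, when present together, are operably linked to the coding sequence and include all components that are necessary or advantageous for expression of the nucleotide sequence encoding the polypeptide to be produced. Each control sequence may be native or foreign to the nucleotide sequence encoding the polypeptide to be produced. Such control sequences may include, but are not limited to, a leader sequence, a polyadenylation sequence, a propeptide sequence, a promoter, a translation initiator sequence, a translation initiator coding sequence, a translation transcription terminator and a transcription terminator sequence. The control sequences may be provided with linkers, e.g. for the purpose of introducing specific restriction sites that facilitate ligation of the control sequences to the coding region of the nucleotide sequence encoding the polypeptide.

Expression control sequences will usually comprise a promoter at a minimum. The term "promoter" as used herein refers to a nucleic acid fragment that functions to control the transcription of one or more genes, is located upstream, with respect to the direction of transcription, of the transcription initiation site of the gene, and is structurally identified by the presence of a binding site for DNA-dependent RNA polymerase, transcription initiation sites and any other DNA sequences, including, but not limited to, transcription factor binding sites, repressor and activator protein binding sites, and any other nucleotide sequences known to the skilled person to act directly or indirectly to regulate the amount of transcription from the promoter. A "constitutive" promoter is a promoter that is active under most environmental and developmental conditions. An "inducible" promoter is a promoter that is active under environmental or developmental regulation.

A DNA segment (such as an expression control sequence) is "operably linked" when it is placed into a functional relationship with another DNA segment. For example, a promoter or enhancer is operably linked to a coding sequence if it stimulates the transcription of that sequence. If the DNA for a signal sequence is expressed as participating in the polypeptide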
分泌的前蛋白,則該信號序列的DNA與編碼該多肽的DNA可操作地連接。一般地,可操作地連接的DNA序列是連續(xù)的,并且在信號序列的情況下,不僅是連續(xù)的而且是處于讀碼狀態(tài)的(inreadingphase)。然而,增強子不必須與它們調(diào)控其轉(zhuǎn)錄的編碼序列連續(xù)。連接用37本領(lǐng)域已知的手段通過在便利的限制性位點處或接頭、連接子或PCR片段處的連接完成。對適當(dāng)?shù)膯幼有蛄械倪x擇一般取決于被選擇用于表達(dá)DNA區(qū)段的宿主細(xì)胞。合適的啟動子序列的例子包括本領(lǐng)域公知的原核和真核啟動子(見例如SambrookandRussell,2001,"MolecularCloning:ALaboratoryManual(3rdedition),ColdSpringHarborLaboratory,ColdSpringHarborLaboratoryPress,NewYork)。轉(zhuǎn)錄調(diào)節(jié)序列典型地包括被宿主識別的異源增強子或啟動子。適當(dāng)啟動子的選擇取決于宿主,但是啟動子如trp、lac和噬菌體啟動子、tRNA啟動子和糖酵解酶啟動子是已知和可獲得的(見例如SambrookandRussell,2001,上文)??梢允褂玫膬?yōu)選的誘導(dǎo)型啟動子的例子為淀粉-、銅-、油酸-誘導(dǎo)的啟動子。絲狀真菌宿主細(xì)胞優(yōu)選的啟動子例如包括Am'gw的葡萄糖淀粉酶啟動子或Aoo^m的TAKA淀粉酶啟動子和WO2005/100573中描述的啟動子。本發(fā)明的核苷酸序列可還包含信號序列,或更確切地包含信號肽編碼區(qū)。信號序列編碼與多肽的氨基端連接的氨基酸序列,其能夠指導(dǎo)被表達(dá)的多肽進(jìn)入細(xì)胞的分泌途徑。信號序列通常含有約4-15個氨基酸的疏水核心,其常緊鄰地位于堿性氨基酸之前。信號肽的羧基端存在被單個插入氨基酸分開的一對小的、不帶電的氨基酸,所述單個慘入氨基酸定義了信號肽切割位點(vonHeijne,G.(1990)J.MembraneBiol.115:195-201)。盡管它們整體的結(jié)構(gòu)和功能相似,但是天然的信號肽不具有共有序列。合適的信號肽編碼區(qū)可得自來自^^W^7/m物種的葡萄糖淀粉酶或淀粉酶基因、來自//^owwcor物種的脂酶或蛋白酶基因、來自fcccAwom;;cMcewv^^的o;-因子基因、來自5a"7/w物種的淀粉酶或蛋白酶基因、或小牛前-原-凝乳酶基因。然而,本發(fā)明中可以使用能夠指導(dǎo)表達(dá)的蛋白質(zhì)進(jìn)入選擇的宿主細(xì)胞的分泌途徑的任何信號肽編碼區(qū)。絲狀真菌宿主細(xì)胞的優(yōu)選的信號肽編碼區(qū)是得自A戸^7tooo^eTAKA淀粉酶基因(EP238023)、^/ergz'〃mmgw中性淀粉酶基因、As^^^'〃mw&er葡萄糖淀粉酶、A/n'zo附wcor脂W^'天冬氨酸蛋白酶基因、//i/mko/a/awwg/"o犯纖維素酶基因、ifwmz.co/a/wso/ews鄉(xiāng)千會佳素醇、//M/w/co/a^wo/ew角質(zhì)酉每、Gflwcfe/a38awtorcfefl脂酶B基因或A/'zomMcorm〖e/ie!'脂酶基因的信號肽編碼區(qū)及其突變體、截短的和雜種信號序列。在本發(fā)明的一個優(yōu)選的實施方案中,編碼信號序列的核苷酸序列是下述編碼序列的一個完整部分,所述編碼序列針對預(yù)定的宿主關(guān)于單個密碼子和/或密碼子對偏向性被優(yōu)化。在本發(fā)明的核酸分子中,編碼序列還優(yōu)選地與翻譯起始因子序列可操作地連接。在真核生物中,起始因子ATG-密碼子之前的核苷酸共有序列(6-12個核苷酸)常由于Kozak在該客體上的初期工作而被稱作Kozak共有序歹iJ(Kozak,M.(1987):ananalysisof5'腸noncodingsequencesfrom699vertebratemessengerRNAs.15(20):8125-47)。包含由Kozak推出的+4個核苷酸的原始Kozak共有序列與高等真核生物中的翻譯起始相關(guān)聯(lián)。對原核宿主細(xì)胞而言,相應(yīng)的Shine-Delgamo序列(AGGAGG)優(yōu)選地存在于原核mRNA的5'-非翻譯區(qū),作為核糖體的翻譯起點作用。在本發(fā)明的上下文中,"翻譯起始因子序列"被定義為編碼多肽的DNA序列的開放讀碼框的起始因子或起始密碼子上游緊鄰
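ten nucleotides. The initiator or start codon encodes the amino acid methionine. The initiator codon is typically ATG, but may also be any functional start codon such as GTG, TTG or CTG.

In a particularly preferred embodiment of the invention, the nucleic acid molecule comprises a coding sequence encoding a predetermined amino acid sequence that is to be expressed in a fungal host cell, i.e. the predetermined host cell is preferably a fungus, most preferably a filamentous fungus. A nucleic acid molecule comprising a coding sequence that has been optimised for expression in fungi according to the method of the invention may further comprise one or more of the following elements: 1) a fungal consensus translational initiator sequence; 2) a fungal translational initiator coding sequence; and 3) a fungal translational termination sequence.

The consensus fungal translational initiator sequence is preferably defined by the sequence 5'-mwChkyCAmv-3', using the following ambiguity codes for nucleotides: m (A/C); r (A/G); w (A/T); s (C/G); y (C/T); k (G/T); v (A/C/G); h (A/C/T); d (A/G/T); b (C/G/T); n (A/C/G/T). According to a more preferred embodiment, the sequence is 5'-mwChkyCAAA-3', 5'-mwChkyCACA-3' or 5'-mwChkyCAAG-3'. Most preferably, the translational initiation consensus sequence is 5'-CACCGTCAAA-3' or 5'-CGCAGTCAAG-3'.

In the context of the invention, the term "consensus translational initiator coding sequence" is defined as the nine nucleotides immediately downstream of the initiator codon of the open reading frame of a coding sequence (the initiator codon is typically ATG, but may also be any functional start codon such as GTG). A preferred fungal consensus translational initiator coding sequence has the nucleotide sequence 5'-GCTnCCyyC-3', using the ambiguity codes y (C/T) and n (A/C/G/T). This gives 16 variants of the translational initiator coding sequence, of which 5'-GCTTCCTTC-3' is the most preferred. With the consensus translational initiator coding sequence, the following amino acids are allowed at the stated positions of the encoded polypeptide: alanine at position +2; alanine, serine, proline or threonine at position +3; and phenylalanine, serine, leucine or proline at position +4. Preferably, in the present invention the consensus translational initiator coding sequence is foreign to the nucleic acid sequence encoding the polypeptide to be produced, but the consensus translational initiator may be native to the fungal host cell.

In the context of the invention, the term "translational termination sequence" is defined as the four nucleotides starting from the translational stop codon at the 3' end of the open reading frame or coding sequence. Preferred fungal translational termination sequences include 5'-TAAG-3', 5'-TAGA-3' and 5'-TAAA-3', of which 5'-TAAA-3' is the most preferred.

A coding sequence encoding a predetermined amino acid sequence to be expressed in a fungal host cell is preferably also optimised with respect to single-codon frequency, such that at least one, two, three, four or five original codons, more preferably at least 1%, 2%, 3%, 4%, 5%, 10%, 15%, 20%, 25%, 50%, 75%, 80%, 85%, 90% or 95% of the original codons, have been exchanged for a synonymous codon, the synonymous codon encoding the same amino acid as the native codon and having a higher frequency in the codon usage defined in Table A than the original codon.

Table A: Optimal codon frequencies (in %) of synonymous codons for filamentous fungi.

         .T.         .C.         .A.          .G.
T..      Phe   0     Ser  21     Tyr    0     Cys    0     ..T
T..      Phe 100     Ser  44     Tyr  100     Cys  100     ..C
T..      Leu   0     Ser   0     Stop 100     Stop   0     ..A
T..      Leu  13     Ser  14     Stop   0     Trp  100     ..G
C..      Leu  17     Pro  36     His    0     Arg   49     ..T
C..      Leu  38     Pro  64     His  100     Arg   51     ..C
C..      Leu   0     Pro   0     Gln    0     Arg    0     ..A
C..      Leu  32     Pro   0     Gln  100     Arg    0     ..G
A..      Ile  27     Thr  30     Asn    0     Ser    0     ..T
A..      Ile  73     Thr  70     Asn  100     Ser   21     ..C
A..      Ile   0     Thr   0     Lys    0     Arg    0     ..A
A..      Met 100     Thr   0     Lys  100     Arg    0     ..G
G..      Val  27     Ala  38     Asp   36     Gly   49     ..T
G..      Val  54     Ala  51     Asp   64     Gly   35     ..C
G..      Val   0     Ala   0     Glu   26     Gly   16     ..A
G..      Val  19     Ala  11     Glu   74     Gly    0     ..G

An even more preferred coding sequence encoding a predetermined amino acid sequence (to be expressed in a fungal host cell) is preferably also optimised with respect to single-codon frequency, such that at least one, two, three, four or five original codons, more preferably at least 1%, 2%, 3%, 4%, 5%, 10%, 15%, 20%, 25%, 50%, 75%, 80%, 85%, 90% or 95% of the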
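original codons, have been exchanged for a synonymous codon, the synonymous codon changing the codon frequency such that the absolute difference between the percentage of that codon in the frequency and the listed optimal percentage becomes smaller after the modification, applying the following list of optimal percentages: cysteine encoded by TGC (100%); phenylalanine by TTC (100%); histidine by CAC (100%); lysine by AAG (100%); asparagine by AAC (100%); glutamine by CAG (100%); tyrosine by TAC (100%); alanine by GCT (38.0%), GCC (50.7%) or GCG (11.3%); aspartic acid by GAC (63.2%); glutamic acid by GAG (74.2%); glycine by GGT (49.0%), GGC (35.9%) or GGA (15.1%); isoleucine by ATT (26.7%) or ATC (73.3%); leucine by TTG (12.7%), CTT (17.4%), CTC (38.7%) or CTG (31.2%); proline by CCT (35.6%) or CCC (64.4%); arginine by CGT (49.1%) or CGC (50.9%); serine by TCT (20.8%), TCC (44.0%), TCG (14.4%) or AGC (20.8%); threonine by ACT (29.7%) or ACC (70.3%); and/or valine by GTT (27.4%), GTC (54.5%) or GTG (18.1%); all other possible amino-acid-encoding codons (0%).

A nucleic acid molecule as defined above that comprises a coding sequence of the invention (for expression in a predetermined host cell) may further comprise elements that are usually present in expression vectors, such as a selectable marker, an origin of replication and/or sequences that promote integration (preferably by homologous recombination at a predetermined site in the genome). Such further elements are well known in the art and need no further description here.

In a further aspect the invention relates to a host cell comprising a nucleic acid molecule as defined above. The host cell is preferably a host cell as defined above.

In yet a further aspect the invention relates to a method for producing a polypeptide having a predetermined amino acid sequence. The method preferably comprises culturing a host cell comprising a nucleic acid molecule as defined above under conditions conducive to expression of the polypeptide and, optionally, recovering the polypeptide.

In yet another aspect the invention relates to a method for producing at least one of an intracellular and an extracellular metabolite. The method comprises culturing a host cell as defined above under conditions conducive to production of the metabolite. Preferably, the polypeptide having the predetermined amino acid sequence in the host (which is encoded by a nucleic acid molecule as described above) is involved in the production of the metabolite. The metabolite (which may be a primary or a secondary metabolite or both, and intracellular, extracellular or both) may be any fermentation product that can be produced in a fermentation process. Such fermentation products include, for example, amino acids such as lysine, glutamic acid, leucine, threonine and tryptophan; antibiotics, including for example ampicillin, bacitracin, cephalosporins, erythromycin, monensin, penicillins, streptomycin, tetracyclines, tylosin, macrolides and quinolones, preferred antibiotics being cephalosporins and β-lactams; lipids and fatty acids, including for example polyunsaturated fatty acids (PUFA); alkanols such as ethanol, propanol and butanol; polyols such as 1,3-propanediol, butanediol, glycerol and xylitol; ketones such as acetone; amines, diamines and ethylene; isoprenoids such as carotenoids, carotene, astaxanthin, lycopene and lutein; acrylic acid; sterols such as cholesterol and ergosterol; vitamins, including for example vitamins A, B2, B12, C, D, E and K; and organic acids, including for example glucaric acid, gluconic acid, glutaric acid, adipic acid, succinic acid, tartaric acid, oxalic acid, acetic acid, lactic acid, formic acid, malic acid, maleic acid, malonic acid, citric acid, fumaric acid, itaconic acid, levulinic acid, xylonic acid, aconitic acid, ascorbic acid, kojic acid and comeric acid. A preferred organic acid is citric acid.

In this document and in its claims, the verb "to comprise" and its conjugations are used in their non-limiting sense to mean that items following the word are included, but that items not specifically mentioned are not excluded. In addition, reference to an element by the indefinite article "a" or "an" does not exclude the possibility that more than one of the element is present, unless the context clearly requires that there be one and only one of the elements. The indefinite article "a" or "an" thus usually means "at least one".

Examples

1. Example 1: Analysis of codon pair bias

1.1 Materials and methods

1.1.1 Data and software

Codon pair analysis can be performed on the coding sequences (CDS) in whole-genome sequence data as well as on partial sets derived from them (or on partial genome sequences, for example cDNA/EST libraries, or even on partial genome data from several genomes of related organisms). The tools used in the present invention read these data using FASTA files as input. Most of the calculations were carried out in MATLAB 7.01 (The MathWorks, Inc., www.mathworks.com), but for some detailed analyses of the results obtained Spotfire DecisionSite 8.0 (Spotfire, Inc., http://www.spotfire.com/products/decisionsite.cfm) was used.

For A. niger, a FASTA file of the cDNA sequences predicted for the whole genome of CBS513.88 (Pel et al., 2007, Nat Biotech. 21:221-231) and a set of 479 highly expressed genes were used. In addition, since usually fewer than half of the >14,000 genes of A. niger are expressed simultaneously under pilot-scale fermentation conditions, data from 24 GeneChips obtained under such conditions were used to select a second set of genes, comprising only genes that were actually expressed in several experiments (only genes with at least 18 "present" calls were considered, using the Affymetrix MAS5.0 array analysis software; this set contains 4,584 genes), and to rank them according to the observed mRNA level (since no other data were available at the time), so that a set of (putatively) highly expressed genes of any size can easily be identified.

For this analysis we used the transcription levels of the genes. Alternatively, quantitative protein expression data may be used, for example two-dimensional gel electrophoresis of proteins followed by identification by mass spectrometry. However, generating protein expression data for large sets of proteins is still time-consuming compared with determining mRNA levels (for example using GeneChips). The effect of codon bias on translation was therefore studied here at a stage before translation actually takes place. Gygi et al. (Yeast. Mol. Cell. Biol. 19(3):1720-30) did in fact find a "correlation of protein and mRNA expression levels with codon bias" in E. coli, although the correlation between mRNA and protein expression levels was only rather rudimentary. In the context of this document, the term "expression level" is therefore used even where only the effect on the transcription level has actually been determined.

For the organism containing about 4,000 genes, a set of 300 highly expressed genes was available and was analysed.

An overview of the basic properties of the genomes of all organisms considered in this study (not all of which will be described in detail, however) is given in Table 1.1. In every analysis, (putative) genes containing one or more stop codons at a position other than the end were ignored, as were sequences whose length is not divisible by 3 (i.e. sequences in which a frameshift may have occurred during sequencing). The first five and the last five codons of every gene were also left out of consideration, because these sites may be involved in protein binding and release efficiency and may therefore be subject to selection pressures different from those on the rest of the sequence, so that codon and codon pair bias there may not be representative. ORFs (open reading frames) shorter than 20 codons were also omitted from the analysis. This has already been taken into account in Table 1.1.

Table 1.1: Nucleotide content of several organisms, including the number of ORFs and the genome size in megabase pairs (Mbp).

Organism            ORFs      Mbp      A      C      G      T
(name illegible)     7,782    10.61    24%    28%    26%    22%
(name illegible)    13,962    18.41    24%    27%    26%    22%
(name illegible)    12,074    16.29    25%    26%    26%    23%
(name illegible)     4,449     3.54    26%    24%    27%    23%
(name illegible)     4,104     3.66    30%    20%    24%    26%
E. coli              4,289     4.09    24%    25%    27%    24%
K. lactis            5,336     7.52    32%    19%    21%    28%
P. chrysogenum      13,164    17.54    24%    27%    25%    23%
(name illegible)     6,449     9.01    33%    19%    20%    28%
(name illegible)     7,894     7.62    14%    37%    35%    13%
T. reesei            8,331    11.45    23%    30%    28%    20%

1.1.2 Expected codon pair occurrence

To analyse codon pair usage, the occurrences of every single codon and of every codon pair are counted first; the latter are denoted $n_{obs}((c_i,c_j))$ below, where obs stands for observed. The double parentheses are needed to indicate that the "observed number" $n_{obs}$ is a function of only one argument, which is itself a pair (in this case a codon pair, i.e. $(c_i,c_j)$). The same applies to all functions of codon pairs defined below. The indices i, j and k can range from 1 to 64,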
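indicating the number of the codon in the internal representation (according to alphabetical order). $(c_i,c_j)$ denotes a codon pair in which $c_i$ is the left codon (i.e. the 5' triplet of the six-nucleotide sequence) and $c_j$ is the right codon (i.e. the one closer to the 3' end). The number of occurrences of every single codon $c_k$ is denoted $n_{sc}^{all}(c_k)$, where the subscript sc stands for single codon and the superscript all indicates that the number refers to the whole genome, as opposed to the superscript g, which can be used for the corresponding quantity in a single gene g; functions of codon pairs such as $n_{obs}((c_i,c_j))$ always refer to the numbers in the whole genome or in a larger set of genes. The single-codon ratios are then calculated (note that in some publications these ratios are also called frequencies; however, codon frequency may also denote the number of occurrences of a codon divided by the total number of all codons):

$$ r_{sc}^{all}(c_k) = \frac{n_{sc}^{all}(c_k)}{\sum_{c_l \in syn(c_k)} n_{sc}^{all}(c_l)} $$

where $syn(c_k)$ denotes the set of codons that encode the same amino acid as $c_k$ and are therefore synonymous with $c_k$. The sum below the fraction bar thus equals the number of occurrences, in the whole proteome, of the amino acid encoded by $c_k$. A short list of the most important symbols and formulae used here is given in Appendix 1.

To reveal whether some of the codon pair preferences are merely a consequence of the preferences for the individual codons, an expected value for every codon pair has to be calculated on the basis of the individual codon frequencies. These are calculated as follows:

$$ n_{exp}^{own}((c_i,c_j)) = r_{sc}^{all}(c_i) \cdot r_{sc}^{all}(c_j) \cdot \sum_{c_m \in syn(c_i),\; c_n \in syn(c_j)} n_{obs}((c_m,c_n)) $$

The superscript own is used to distinguish these values from those obtained with the other methods mentioned below. The last factor of this equation sums the actual occurrences of all synonymous codon pairs. The expected number for every codon pair is therefore the product of the individual codon usage ratios and the number of occurrences of the respective amino acid pair.

Gutman and Hatfield (1989, Proc. Natl. Acad. Sci. USA 86:3699-3703) proposed another way of calculating expected values. Their initial approach was to calculate the codon frequencies for every gene individually (i.e. the number of occurrences of a codon in gene g divided by the total number of codons in g, this total being denoted |g|), and then to multiply these values pairwise with each other and with the number of codon pairs in that sequence (which is |g| - 1):

$$ n_{exp}^{gh1}((c_i,c_j)) = \sum_{g} \frac{n_{sc}^{g}(c_i)}{|g|} \cdot \frac{n_{sc}^{g}(c_j)}{|g|} \cdot (|g| - 1) $$

In this equation "gh1" stands for Gutman and Hatfield method 1 (1989, supra). This gives an expected codon pair value for every gene (the part after the sum operator in the equation above); these are then added together to give the final expected value, which by definition is adjusted for possible differences in single-codon usage between different genes of the same genome, but does not take into account a possible bias in amino acid pair usage. This means that if certain amino acids tend to be adjacent to each other more often than others, in other words if the occurrence of amino acid pairs differs from what it would be in a randomised sequence with the same amino acid composition, the expected values will also be considerably off, because codon pairs encoding rarely used amino acid pairs will have expected values that are too high and codon pairs encoding more frequently used amino acid pairs will have expected values that are too low.

Gutman and Hatfield (1989, supra) also proposed normalising their expected values for amino acid pair bias. To this end they simply compared the expected values for amino acid pairs obtained with their method with the observed values, and scaled the expected values of all affected codon pairs accordingly, so that the former match the latter:

$$ n_{exp}^{gh2}((c_i,c_j)) = n_{exp}^{gh1}((c_i,c_j)) \cdot \frac{\sum_{c_m \in syn(c_i),\; c_n \in syn(c_j)} n_{obs}((c_m,c_n))}{\sum_{c_m \in syn(c_i),\; c_n \in syn(c_j)} n_{exp}^{gh1}((c_m,c_n))} $$

In this equation "gh2" stands for Gutman and Hatfield method 2 (1989, supra).

1.1.3 Calculating codon pair bias

The actual codon pair bias $bias((c_i,c_j))$ should then follow from the difference between the expected and the actual (observed) numbers of codon pairs (where any of these methods for the expected value may be used). The initial approach was simply to calculate

$$ bias((c_i,c_j)) = \frac{n_{obs}((c_i,c_j)) - n_{exp}((c_i,c_j))}{n_{exp}((c_i,c_j))} $$

In this way the bias value indicates by what fraction (i.e. by what percentage, after multiplication by 100%) a codon pair is actually used more or less often than expected. For amino acid pairs that do not occur in the analysed set of genes, the bias value according to this formula would be 0/0 for all corresponding codon pairs; in that case it is defined as 0. The lower limit of the bias values would thus be -1, whereas there is no clear upper limit. This was considered somewhat impractical, and therefore

$$ bias((c_i,c_j)) = \frac{n_{obs}((c_i,c_j)) - n_{exp}((c_i,c_j))}{\max\big(n_{obs}((c_i,c_j)),\; n_{exp}((c_i,c_j))\big)} $$

is used instead, where max(a,b) denotes the larger of the two values a and b. This always results in bias values between -1 and 1. The bias value can be -1, but not +1. The former occurs when a certain codon pair is not used at all to encode an amino acid pair that does occur; the value +1 cannot be reached because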
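$n_{exp}((c_i,c_j))$ would then have to be 0, which is only possible if $n_{obs}((c_i,c_j))$ is 0 as well. The interpretation given above remains valid for bias values < 0 (this means $n_{obs}((c_i,c_j)) < n_{exp}((c_i,c_j))$, so that both formulae give the same result). If $n_{obs}((c_i,c_j)) > n_{exp}((c_i,c_j))$, the bias value (which is then > 0) indicates by what percentage the expected value is lower than the observed value (i.e. the baseline is changed in that case).

1.1.4 Statistical significance of the bias

Gutman and Hatfield (1989, supra) used the χ²-test to determine the statistical significance of their results. This test is used to check how likely it is that an observed result occurs by chance under a particular hypothesis. When examining codon pairs, the hypothesis would be that codon pair usage is the result of choosing every codon independently at random. To test this hypothesis, the χ²-value is calculated:

$$ \chi^2 = \sum_{(c_i,c_j) \in CP} \frac{\big(n_{obs}((c_i,c_j)) - n_{exp}((c_i,c_j))\big)^2}{n_{exp}((c_i,c_j))} $$

(where CP denotes the set of all codon pairs that do not include a stop codon). The number of degrees of freedom is then 3720 (61*61-1). If codon pair choice were random, a χ²-value of around 3720 (equal to the number of degrees of freedom) would be expected, with a standard deviation equal to the square root of twice the number of degrees of freedom.

In this way the overall statistical significance of the observed bias can be tested. However, the statistical significance of the bias of individual codon pairs can also be derived. As in the methods for calculating expected values presented above, the number of occurrences of a codon pair is regarded as the outcome of a series of independent yes/no experiments (yes: these two codons are chosen to encode the respective amino acid pair; no: another codon pair is chosen), so that it follows a binomial distribution, which can be approximated by a normal distribution if the analysed set of genes is large enough. This is considered a good approximation if n*p > 4, where n denotes the number of experiments and p the probability of "yes"; n*p is also the expected value. For every codon pair, the standard deviation can therefore be calculated as

$$ \sigma((c_i,c_j)) = \sqrt{ n_{exp}((c_i,c_j)) \cdot \big(1 - r_{sc}^{all}(c_i) \cdot r_{sc}^{all}(c_j)\big) } $$

The standard score (also called the z-score) can then be calculated:

$$ z((c_i,c_j)) = \frac{n_{obs}((c_i,c_j)) - n_{exp}((c_i,c_j))}{\sigma((c_i,c_j))} $$

The absolute value of the z-score indicates by how many standard deviations the actual (observed) value deviates from the expected value. Assuming a normal distribution, about 95% of all observations should lie within two standard deviations of the expected value and >99% within three standard deviations.

1.2 Results

1.2.1 Existence of codon pair bias

Using the methods described above, we found that a significant codon pair bias exists. For all organisms studied, the χ²-test gave χ²-values that were several times the number of degrees of freedom and therefore also many standard deviations above the expected value. As regards the bias of individual codon pairs, the finding of Moura et al. that in yeast "about 47% of codon pair contexts fall within the interval of -3 to +3 standard deviations" from the expected value could be confirmed (although they calculate expected values in a different way); these values correspond to the z-scores in our analysis.

In short, there are considerably more codon pairs with rather high z-scores than there should be if codon pair usage were random. See Table 1.2: with a random choice leading to an approximately normal distribution, only about 5% of all codon pairs, for example, would have a z-score greater than 2 or smaller than -2, but in the whole genomes of the four selected organisms this actually applies to more than two thirds.

Table 1.2: z-scores in different organisms.

|z-score|              > 1      > 2      > 3
Normal distribution    31.7%    5.0%     0.3%
(name illegible)       86.1%    73.7%    60.4%
(name illegible)       89.2%    79.1%    69.7%
A. oryzae              88.4%    76.7%    65.1%
(name illegible)       88.1%    76.4%    64.0%
(name illegible)       86.1%    72.0%    59.3%
E. coli                86.1%    74.8%    64.0%
(name illegible)       82.6%    67.0%    53.4%
P. chrysogenum         89.3%    79.1%    69.0%
(name illegible)       82.7%    67.6%    52.1%
(name illegible)       82.0%    66.5%    53.5%
T. reesei              89.0%    79.8%    71.0%

Note that these values are somewhat related to genome size (compare Table 1.1), i.e. organisms with larger genomes tend to have codon pairs with more extreme z-scores. In particular, when a smaller set of genes is analysed (for example the 479 highly expressed genes of A. niger), the values are lower (65.1%, 37.2% and 19.7% respectively for this example), because smaller numbers of occurrences lead to higher standard deviations (relative to the expected values) and thus to a lower statistical significance of the results. This leads to the conclusion that codon pair usage is not the result of a random choice of codons according to the single-codon ratios.

The distribution of the bias values themselves differs from one organism to another. This is illustrated by Figure 3, which shows the distribution of the codon pair bias values of the 3,721 sense:sense codon pairs in different organisms. The number in the upper right corner of each histogram in Figure 3 is the standard deviation of the observed distribution; the mean values (not shown) lie between -0.06 and 0.01 for all organisms. The histograms of Figure 3 show that, of the ten organisms tested, the bacteria E. coli, two Bacillus species (names illegible in the source) and S. coelicolor have the most extreme codon pair bias, whereas the bias is less extreme in the fungi A. niger, A. oryzae, A. terreus, A. nidulans and P. chrysogenum and in the yeasts S. cerevisiae and K. lactis.

Another interesting observation can be made when comparing the codon pair bias of different organisms. Bias values from related organisms show a higher correlation than those from unrelated organisms. This is illustrated by Figure 4, which shows the correlation between the codon pair biases of various organisms; the correlation coefficient is shown in the upper right corner of each panel. In this analysis the highest correlations are observed between A. niger and P. chrysogenum and between an Aspergillus species (name illegible in the source) and A. oryzae, and the lowest correlation, i.e. effectively none, between an organism whose name is illegible in the source and S. coelicolor. Interestingly, no negative correlation is observed. This means that although organisms with a high GC content (such as S. coelicolor) mostly prefer those codons that are used less in AT-rich organisms (such as S. cerevisiae or, although it is not extremely AT-rich, B. subtilis), there are no two organisms for which the pairs preferred in one tend to be rejected in the other and vice versa. This may indicate that, although the bias of almost every single codon is organism-dependent, there are some codon pairs that are preferred and/or rejected in almost every organism (for example because of the likelihood that they cause frameshifts or involve tRNAs with structures that do not fit together).

1.2.2 Patterns in codon pair bias

To visualise the observed codon pair bias, so-called maps can be drawn as done by Moura et al. (2005), who call them "codon context maps". This is most easily explained with a colour picture consisting of a coloured rectangle for every codon pair, the rows representing the first codon of the pair and the columns the second. Red indicates a negative bias and green a positive bias. White indicates codon pairs that actually have a bias equal to 0 (which is the case for ATG-ATG, for example, since this is the only way to encode the amino acid pair Met-Met) and pairs involving a stop codon. However, colour pictures cannot be part of the disclosure of a patent application. For display in black and white, the picture has been split into two pictures in this example: Figure 5A shows the positive codon pairs of A. niger and Figure 5B the negative codon pairs (see also Appendix 3, Table C1). The more biased a codon pair, the blacker the corresponding rectangle. The bias values here range between -0.67 and 0.54, whereas in other organisms they may even slightly exceed +/-0.9 (see also Figure 3). The highest intensity of black in these rectangles (originally green, top, and originally red, bottom) represents values of 0.9 and -0.9 respectively (not reached here; the absolute value of the maximum bias is usually somewhat lower than that of the minimum bias). In addition, we refer to the CPW matrix tables in Appendix 3, which contain the numerical values of the codon pair biases, and to Figure 5 as a black-and-white example of the colour picture, so that the skilled person can reconstruct the colour version from the values in the tables of Appendix 3.

The first approach to these codon pair maps was to sort rows and columns in alphabetical order (since this is their internal representation). In this map the diagonal appears to contain slightly more green points than red points, which indicates that many codons have the same codon as their preferred neighbour. In addition, most adjacent columns are somewhat similar, whereas adjacent rows are not (data not shown); see Figures 5A and 5B and Appendix 3, Table C1. Most rows, however, resemble the rows three rows further on, i.e. there is some similarity in every fourth row. Since what every fourth row has in common is the last nucleotide of the first codon of the pair, the rows are preferably sorted by the alphabetical order of the third position as the first sorting criterion and by that of the middle position as the second. What can then be seen in the A. niger map (Figures 5C and D, and Appendix 3, Table C1) is that the bias does indeed appear to be related mainly to the last nucleotide of the first (5') codon and the first nucleotide of the second (3') codon, because the values in most of the blocks of 16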
*16密碼子對的各塊值中大部分具有相同的顏色。例如,在Aspergillus中可以鑒定的一個普遍規(guī)律是密碼子對如xxT-Axx(x表示任何核苷酸,指出各位置上的核苷酸對于所說明的規(guī)律不重要)被拒絕(左下角的紅色塊),而模式xxA-Txx表征了優(yōu)選的密碼子(右上角的綠色塊),再次指出密碼子對偏向性是有方向性的。然而,并非所有偏向性可以僅用密碼子對"中間"兩個相鄰的核苷酸中的模式解釋。例如xxC-Axx密碼子對(見最左側(cè)從頂部起第二塊)通常不被優(yōu)選或拒絕,但是對xxC-AAx模式的對(注意剛提到的塊左側(cè)的四個綠色塊)存在清楚的偏好。偏向性也可取決于不相鄰的核苷酸(例如及s由zfc中對CxA-Gxx對的強拒絕;見圖6A和6B和附錄3,表C4)。不幸的是,密碼子對偏向性不能總是歸因于這類"簡單"的模式(見例如圖7A和B和附錄3,表C5中針對S的相當(dāng)混亂的圖譜)一甚至當(dāng)使用SpotfireDecisionSite8.0(http:〃www.spotfire.com/products/decisionsite.cfm)進(jìn)《f聚類分豐斤日寸,也未發(fā)現(xiàn)普遍特性(數(shù)據(jù)未顯示),即鑒定的聚類主要由不相關(guān)的密碼子組成(即相同位置沒有通用核苷酸)。1.2.3偏向性和表達(dá)水平的關(guān)系觀察A的具有高表達(dá)水平(或更好地為假定高表達(dá)水平,因為它們僅通過觀察轉(zhuǎn)錄水平被鑒定)的基因的偏向性圖譜(見圖8),更大組(即簡圖中的塊)的存在不是同樣明顯的(或者,也就是說,如上所述的簡單規(guī)律可能完全不存在)。然而因為所有密碼子對中的三分之二在該組中出現(xiàn)36倍或更少倍,還因為如上所述平均低得多的z-分?jǐn)?shù),人們能夠?qū)⑦@歸因于大范圍的隨機變異。圖9顯示Am'gw的一組479個高表達(dá)基因的偏向性(垂直軸)對所有基因中偏向性(水平)的散布圖。顯示了不涉及終止密碼子的所有3,721個密碼子對。從淺灰到黑色的陰影處理根據(jù)全基因組中z-分?jǐn)?shù)的絕對值指定(即圖中的淺色點在所有的基因中不具有顯著偏向性),大小根據(jù)高表達(dá)組中z-分?jǐn)?shù)的絕對值指定,即非常小的點在其中不具有顯著偏向性(此處lz-分?jǐn)?shù)|<1.9)。黑色實線指出兩個偏向性數(shù)值相等的地方;黑色虛線顯示實際相關(guān)的最佳線性逼近(通過主成分分析鑒定的);其斜率在2.1左右。將每個密碼子對在高表達(dá)組中和全基因組中的兩個偏向性值進(jìn)行比較時(見圖9中的散布圖),可以看出對大部分對而言,高轉(zhuǎn)錄組中的偏向性更加極端,即如果低于0則更低且如果為正則更高,但是存在一些對,其偏向性值相當(dāng)不同,甚至具有不同的標(biāo)志。然而,這些主要是在頂部組(topgroup)中小量發(fā)生的密碼子對,其中偏向性高度顯著的大部分對(藍(lán)色,大圈)在兩組中具有相似的偏向性(即它們接近藍(lán)色線,所述藍(lán)色線指出兩個偏向性值相等的位置)。52未能(無論是針對Am'gw還是針對均及^/6"fc未能)找到分享三個核苷酸中兩個的、涉及密碼子的相似偏向性差異的特定模式,即在與上圖相同的密碼子差異圖中,不存在具有相似的偏向性差異的更大組。1.3.為了基因適應(yīng)鑒定密碼子對權(quán)重的細(xì)節(jié)目前可以根據(jù)所述的方法來測定用于適應(yīng)的密碼子對權(quán)重(附錄1:密碼子對權(quán)重-方法一個序列組(或基因組))1.基于基因的全集;基于1的子集。2.被識別為高表達(dá)基因的部分。另外,我們開始搜索以鑒定與更高的轉(zhuǎn)錄水平明顯相關(guān)的密碼子對權(quán)重,其是適應(yīng)密碼子對使用的改進(jìn)的方法所需要的,己經(jīng)應(yīng)用以下的方法在中計算每個基因的平均密碼子對權(quán)重(即刀^(g)值的等價物),所述Am'gw中可以針對上述4,584個實際表達(dá)的基因集合獲得取自GeneChip數(shù)據(jù)的完全分級(見"材料和方法"中的"數(shù)據(jù)")。然后根據(jù)適合度值(升序)和表達(dá)水平(降序)來排列基因。因為高表達(dá)的基因應(yīng)該具有低密碼子對適合度值,當(dāng)使用理想的密碼子對權(quán)重時這兩個分級應(yīng)當(dāng)是相等的,因此這兩個分級的比較能夠給出關(guān)于適合度函數(shù)(其
中對高表達(dá)基因的"正確"分級給予了比中等表達(dá)基因的分級稍微更多的關(guān)注)中使用的權(quán)重品質(zhì)的信息。另外,計算了4,584個基因的分級和平均密碼子對權(quán)重之間的相關(guān)系數(shù)(協(xié)方差除以每個變量的標(biāo)準(zhǔn)差)。已經(jīng)檢査了若干種可能的權(quán)重集合,包括i.來自全基因組的偏向性值,ii.高表達(dá)組的偏向性值,iii.具有所有值的偏向性,其不具有設(shè)置為零的確定的最小z-分?jǐn)?shù)iv.上升至2的冪次(和一些其它值)的偏向性值,以給予高表達(dá)的或拒絕的密碼子更低/更高的影響v.其組合vi.z-分?jǐn)?shù)自身"1.來自高表達(dá)組和全基因組的偏向性值/2-分?jǐn)?shù)的差異。53對遺傳算法而言,它們的求反(negation)己被使用,因為已用正值(相當(dāng)隨意地)識別優(yōu)選的密碼子對,然而GA進(jìn)行最小化。這適用于所有提到的權(quán)重。其中,"最佳"的權(quán)重矩陣最終是ii到iv項的組合,然而,使用基于全基因組的密碼子比例計算的預(yù)期值通過計算高表達(dá)組中的密碼子對"偏向性",如上所述可以獲得進(jìn)一步更好的矩陣。圖10顯示觀察到的相關(guān)。與測試的所有其它權(quán)重結(jié)合不同,涉及在高表達(dá)組中更不足量表現(xiàn)的密碼子的密碼子對在本文中具有輕度的缺點。因此,這些權(quán)重是僅有的也反映高表達(dá)的組和所有基因的不同單個密碼子偏向性的權(quán)重。使用這些權(quán)重帶有拒絕下述一些密碼子對的風(fēng)險,所述密碼子對實際上在高表達(dá)的組中具有正偏向性,但是由(高表達(dá)的組中)很少使用的密碼子組成。然而,因為我們期望的單個密碼子比例通常與具有高表達(dá)的基因組中的單個密碼子比例并不相同,而是比它們更加"極端",單個密碼子優(yōu)化無論如何會代替這些不足量表現(xiàn)的密碼子,因此我們能夠認(rèn)為上文所述的權(quán)重對于密碼子對優(yōu)化是非常方便的??偠灾航?jīng)如上所述鑒定了用于基因適應(yīng)的被潛在地改進(jìn)的密碼子對權(quán)重矩陣。方程在附錄1中給出密碼子對權(quán)重一方法高表達(dá)組與參考組(或基因組)。1.4.在計算機芯片上的單個密碼子和密碼子對優(yōu)化1.4.1材料和方法用于分析和優(yōu)化基因的被開發(fā)的MATLAB工具箱由若干個函數(shù)組成,這些函數(shù)根據(jù)其能力被組織在不同的目錄中。因此為了使用它們,必須使得它們對MATLAB環(huán)境而言均是已知的。為此,從文件菜單中選擇"設(shè)置途徑(SetPath)"并點擊"從子文件夾添加(Addst^/oWera)"并選擇安裝該工具箱的途徑(通常被稱作"Matlab-bio")。還添加FASTA和應(yīng)當(dāng)被分析的其它文件的位置。所有個體MATLAB函數(shù)簡要描述于"contents.m"中(鍵入"helpMatlab-bio"從而在MATLAB環(huán)境中顯示該文件,并且在函數(shù)名前使用"help"得到關(guān)于該函數(shù)的詳細(xì)信息)。對于關(guān)注54于密碼子對使用的基因優(yōu)化而言,兩個重要的函數(shù)是"follanalysis"和"geneopt"。如果你希望一個基因所適應(yīng)的生物的全基因組位于該文件(即"Jm'gw—O^F/osto")中且其高表達(dá)基因的標(biāo)識符在"a"-Z^g/.加"中,鍵入"fullanalysis('Aniger—ORF.fasta','an-high.txt','an');,,后,你會得到(i)全基因組的密碼子對偏向性圖譜,(ii)第二個文件中基因的組的密碼子對偏向性圖譜,和(iii)MATLAB工作空間中用于進(jìn)一步用途的若干個變量(即臨時存儲的數(shù)據(jù)集合)。"follanalysis"的第三個參數(shù)僅確定這些變量如何被命名,并且如果同時只要分析一個基因組的話可以被省略。所提到的變量中有(i)全基因組的密碼子對使用和偏向性數(shù)據(jù)(在該實施例中稱作"^朋"),(ii)由第二參數(shù)說明的特定組基因的密碼子對使用和偏向性數(shù)據(jù)(稱作"c/^ra")和(iii)下述結(jié)構(gòu),其具有能夠用于遺傳算法的目標(biāo)單個密碼子比例和密碼子對權(quán)
重。"fiillanalysis('Xyz—ORF.fasta,);"僅會顯示密碼子對偏向性圖譜并存儲各自基因組的偏向性數(shù)據(jù)。盡管第二參數(shù)可以是包含基因標(biāo)識符的任何文件(例如具有低表達(dá)的基因或具有某共有功能的基因的集合),但是其總是被對待為關(guān)于該(潛在的)參數(shù)的高表達(dá)基因的集合(在該實施例中稱作"o;^;w畫/ora"",這表示牛寺定生物的優(yōu)"f七參數(shù)(^e0/"/w'za"o"_pflraw"erAes/e"yo/orgasm))。注意此處的單個密碼子比例被簡單地計算Cg"")=2^sA")-《""),這是可接受的近似值。同樣可以通過其它方法鑒定目標(biāo)比例從而進(jìn)一步改進(jìn)期望比例的規(guī)格,所述其它方法包括單個密碼子分布的細(xì)節(jié)(見主文本)。另外,當(dāng)未發(fā)現(xiàn)特異偏向性時目標(biāo)比例可以保持為空,從而在尋找具有更高密碼子對適合度時給予密碼子對算法更多自由度。附錄1中針對多種宿主生物給出了若干個這類預(yù)定的單個密碼子目標(biāo)載體。為了將預(yù)定的單個密碼子目標(biāo)比例用于遺傳算法,如下改變參數(shù)的字段(field)"c,,鍵入"optparamforan.cr=[",然后粘貼單個密碼子比例(例如從Excel表格中拷貝;注意它們應(yīng)當(dāng)按照密碼子的字母順序),如果該比例可以作為64-元行獲得則鍵入"];",或如果它們從列中拷貝則鍵入55"],;",按回車(注意后一情況下方括號后額外的單引號或上撇號)。不重要的密碼子比例(即其中不期望特定的目標(biāo)比例)可以被指派"數(shù)值"NaN(不是數(shù)字),并且在計算單個密碼子適合度時它們會被忽略。為了從優(yōu)化的基因中排除某些短序列,以同樣的方式設(shè)定參數(shù)'、",其中每條序列必須括在單引號內(nèi),且所有的序列必須被一起括在大括號內(nèi)例如(無斷行)"optparamforan.rs={'CTGCAG''GCGGCGCC'〉;"。最后,可以改變參數(shù)的字段cp',給予單個密碼子優(yōu)化或密碼子對優(yōu)化在組合的適合度函數(shù)中更高的重要性(見"結(jié)果和討論"中的分段"進(jìn)行密碼子對優(yōu)化")。默認(rèn)值為0.2。如果用密碼子對優(yōu)化的基因的實驗結(jié)果顯示密碼子對優(yōu)化的基因與單個密碼子優(yōu)化的基因相比很少的改進(jìn),則將其設(shè)定為更低的值;在相反的情況下,更高的c^'可能更好。然后可以使用函數(shù)geneopt進(jìn)行使用遺傳算法的基因?qū)嶋H優(yōu)化。只需要下述參數(shù)要被優(yōu)化的序列和含有密碼子對權(quán)重的結(jié)構(gòu)、如上所述的目標(biāo)比例和限制性位點,因此geneopt('MUVARNEQST*,,optparamforan);可以例如被用于優(yōu)化給定的(相當(dāng)短的)蛋白質(zhì)序列從而在中高表達(dá);'*,被用于表示得到的遺傳序列應(yīng)當(dāng)在末端具有終止密碼子(然而,因為中的最優(yōu)終止信號被認(rèn)為是四聚體TAAA,所以這不是必要的)。注意要被優(yōu)化的序列也必須括在單引號內(nèi);如果該序列僅含有字母A、C、G、T或U且其長度為3的倍數(shù),則其被自動地識別為核苷酸序列。然后將遺傳算法進(jìn)行1000世代,種群大小為200,其中80個被該世代保留(79個最優(yōu)的和一個隨機挑選的)并用于產(chǎn)生新個體,其中40%的新個體使用交換產(chǎn)生,60%的新個體使用突變算子產(chǎn)生。這些默認(rèn)值證明對優(yōu)化是非常便利的,即這些參數(shù)中的改變僅會(如果根本的話)導(dǎo)致非常輕微"更好"的基因,但是如果在優(yōu)化上應(yīng)當(dāng)花費顯著更多或更少的計算時間,則它們可同樣被改變(在1.4GHzPentiumM處理器上用約500個密碼子的基因平均運行g(shù)eneopt耗時約15分鐘)。geneopt(seq,optparamforan,[50750500.6])會例如使得遺傳算法計算種群的750個世代,其中每個新世代保留50個個體并新產(chǎn)生250個個體(5*50;即在每代中檢查300個個體),僅保留最優(yōu)的(并且無隨機挑選的)個體且60%的重組使用交換算子進(jì
n)行。關(guān)于如何明確這些參數(shù)的更多細(xì)節(jié),鍵入helpgeneopt禾口helpgeneticalgorithm。注意盡管本文中針對A"&er和及犯6rifc顯示和描述了通過分析相應(yīng)的FASTA文件產(chǎn)生密碼子對權(quán)重的步驟,但是只有對這兩種生物這不是必要的,因為已針對先前的基因優(yōu)化進(jìn)行過這些計算。為了更簡單的使用,已存儲了遺傳算法的各自參數(shù)(分別鍵入"loadgadata一for—an"或"loadgadata—for一bs";注意參數(shù)現(xiàn)在僅簡單地稱作anjaram和bs_param)。1.4.2結(jié)果圖ll顯示了五個優(yōu)化版本的適合度值,各針對不同的c^值(見圖ll中的圖例)。蛋白質(zhì)為針對宿主Am'ger(見實施例2)優(yōu)化的真菌o;-淀粉酶(FUA;也稱作AmyB)。另外,顯示了"純"單個密碼子優(yōu)化(右側(cè)黑點)的結(jié)果和密碼子對優(yōu)化的結(jié)果。通過對400個種群大小將遺傳算法進(jìn)行IOOO世代左右獲得優(yōu)化的版本,每個世代在1.4GHzPentiumM上運行耗時約17分鐘。注意純單個密碼子優(yōu)化和純密碼子對優(yōu)化僅耗費該時間的約60%。在圖11中,野生型WUg—)=0.165,y^p(g—)=0.033)不適合該圖(其應(yīng)該在右上遠(yuǎn)處)。最優(yōu)的基因總是具有最低刀4c和刀一值的基因??紤]到點的位置,不清楚針對哪個c^值能夠獲得最改進(jìn)的基因,因為我們還不知道是單個密碼子使用還是密碼子對使用更加重要。然而,一種費用平衡(faretrade-off)似乎在cpz'=0.2的情況下發(fā)生。單個密碼子和密碼子對使用中的改進(jìn)可以顯示在該工作中提出的所謂序列品質(zhì)繪圖中。圖12闡述了兩幅簡圖,其顯示上述FUA(也見實施例2)的(449個中)最初20個密碼子的序列品質(zhì)。注意這些序列品質(zhì)簡圖不僅取決于序列自身,而且取決于權(quán)重和期望的單個密碼子比例的設(shè)置,并因此取決于生物。注意對于具有低或無密碼子偏向性的密碼子而言,還可能將目標(biāo)單個密碼子比例定義為"不關(guān)注",即不考慮某密碼子的使用與其同義密碼子相比對于表達(dá)是正還是負(fù)。在該情況下,基因中各自密碼子的實際比例僅57顯示藍(lán)色的X-標(biāo)記,并且計算單個密碼子適合度時該具體的位置被忽略(見1.4在計算機芯片上的單個密碼子和密碼子對優(yōu)化)。1.5結(jié)論已經(jīng)在大范圍的生物中確立了密碼子對使用和轉(zhuǎn)錄水平的顯著相關(guān)。這證實了該偏向性不能僅通過開發(fā)讀碼位點周圍的單核苷酸偏向性來解釋。因為偏好或拒絕某密碼子對的可能的解釋均集中在翻譯上,所以應(yīng)當(dāng)假定偏好或拒絕二者均由天然選擇引起,所述天然選擇同時作用于影響翻譯的特征和影響轉(zhuǎn)錄的其它特征,從而最小化細(xì)胞生產(chǎn)酶或至少更重要的酶的作用。除了經(jīng)典的單個密碼子優(yōu)化或密碼子對調(diào)和外,可因此考慮在多肽編碼序列中優(yōu)化密碼子對使用以達(dá)成改進(jìn)的過表達(dá),其中對于優(yōu)化僅考慮單個密碼子頻率。對于在該實施例中被研究的真菌宿主種類和bacilli而言,相同基因的密碼子對適應(yīng)和單個密碼子適應(yīng)僅輕微地干擾,即二者可同時進(jìn)行且結(jié)果會比野生型基因具有"更好"的單個密碼子使用和"更好"的密碼子對使用,且當(dāng)忽略另一個時,兩方面中的任何能夠僅被輕微地改進(jìn)。為了閱讀FASTA文件,并進(jìn)行分析和優(yōu)化,已開發(fā)了用戶友好的MATLAB函數(shù)。已介紹了展示單個基因的密碼子對偏向性和密碼子對使用的新方法,見實施例2和實施例4。針對優(yōu)化設(shè)計的遺傳算法允許有效處理鄰近密碼子對的相互依賴性帶來的約束,同時特別設(shè)計的突變算子有助于克服由于遺傳算法在最初幾個世代后的重組步驟中產(chǎn)生許多不良的可能解的性狀而通常伴隨著遺傳算法的無效(inefficiency),所述突變算子總能改進(jìn)序列品質(zhì)(單個密碼子和密碼子對適合度)的兩個方面之一。合適的密碼子對使用能影響對酶的生產(chǎn),這會在以下的實施例中通過實驗顯示。已經(jīng)制備了要在及^6"fo中表達(dá)的三個基因的密碼子
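pair-optimised variants, of which one will be compared with a synthetic gene adapted for single-codon usage only, and another with a synthetic gene that has undergone the optimisation procedure using the negation of the weights assumed to be favourable, but that is still optimised for single-codon usage in the same way as before (see Examples 4 and 5). In this way the view of Irwin et al. (1995), which is rejected here, that underrepresented codon pairs stimulate translation is tested as well. For A. niger, the codon pair optimised version of the above amyB will be tested and compared with the wild type and with a synthetic gene with single-codon harmonisation; see Examples 2 and 3.

2. Example 2: Use of the method of the invention for constructing improved DNA sequences to improve production of the Aspergillus niger fungal amylase in A. niger

Below, the method of the invention is applied to design a novel nucleotide sequence for the AmyB (FUA) gene of A. niger that is optimised in single-codon and/or codon pair usage for improved expression in A. niger. The method can be applied in the same way to improve the codon usage of any nucleotide sequence.

2.1 Introduction

The concept of single-codon optimisation by codon harmonisation was developed earlier by the applicant of the present invention and is reported in the main text (see also Example 3). In this example we show how the method of the invention is applied to design a gene that is optimised for both single-codon and codon pair usage. In this particular case, weight matrices are applied that were generated using two subsets comprising the 2% and the 4% most highly expressed genes of the whole A. niger genome of 14,000 genes. For single-codon usage, the algorithm drives the solution towards a gene with the synonymous codon frequencies defined in Table B.1 (= third column of Table 2.1); for codon pair usage, it optimises towards the set of optimal codon pairs, i.e. a high frequency of codon pairs that have an associated negative weight (in Table C.2), which are the codon pairs that are overrepresented relative to their expected value in the set of the 4% most highly expressed genes.

Note that if no established list of highly expressed genes is available for a particular host, one may also (i) use the weight matrix of a similar host organism, for example the P. chrysogenum matrix may be used for A. niger, or (ii) use whole-genome sequence data, or a subset of it, to generate a good but less optimal weight matrix.

2.2 Materials and methods

2.2.1 Wild-type amyB coding sequence encoding the A. niger α-amylase AmyB

The DNA sequence of the amyB gene encoding the α-amylase protein is disclosed in J. Biochem. Mol. Biol. 37(4):429-438 (2004) (Matsubara T., Ammar Y.B., Anindyawati T., Yamamoto S., Ito K., Iizuka M., Minamiura N., "Molecular cloning and determination of the nucleotide sequence of raw starch digesting alpha-amylase from Aspergillus awamori KT-11.") and is also available from the EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/index.html) under accession number AB083159. The genomic sequence of the native A. niger amyB gene is shown as SEQ ID NO. 1. The corresponding coding or cDNA sequence of amyB is shown as SEQ ID NO. 2. The translated sequence of SEQ ID NO. 2 is designated SEQ ID NO. 3 and represents the A. niger α-amylase protein AmyB. This sequence also has 100% similarity with the A. oryzae α-amylase protein (Wirsel S., Lachmund A., Wildhardt G., Ruttkowski E., "Three alpha-amylase genes of Aspergillus oryzae exhibit identical intron-exon organization."; Mol. Microbiol. 3:3-14 (1989); UniProt accession nr. P10529, P11763 or Q00250). The amyB cDNA sequence was subjected to optimisation according to the method of the invention.

2.3 Design procedure

The optimised coding nucleotide sequence SEQ ID NO. 6 is the result of running the software method described. The parameters applied were: population size = 200; number of iterations = 1000; cpi = 0.20; CPW matrix = "Table C.2. CPW: Aspergillus niger - highly expressed sequences"; and CR matrix = "Table B.1, column 4: CR table ANS: Aspergillus niger - highly expressed sequences". In addition, a penalty value of +1 was added to fit_combi for every occurrence of a PstI (CTGCAG) or NotI (GCGGCGCC) site. The convergence of the solution towards the minimum value of fit_combi is shown in Figure 13. The objective values obtained for SEQ ID NO. 6 are given in Table 2.2 together with those for SEQ ID NO. 2 and SEQ ID NO. 5. Figure 14 explains the single-codon statistics of these genes as shown in Figures 15 and 16, and Table 2.1 gives the actual codon counts in the three sequences. Figures 18-20 show the statistics of both single codons and codon pairs for the three gene variants. This type of diagram is explained in detail in Figure 17 and its description. These figures make clear that the single-codon statistics of SEQ ID NO. 5 and SEQ ID NO. 6 are highly similar. However, the method of the invention results in a gene with an increased number of codon pairs with an associated negative weight (w_cp(g) ≤ 0) (93% versus 74%), and in a further reduction of fit_cp from -0.18 to -0.34, indicating a more optimal use of the codon pairs that have more negative weights associated with them.

Table 2.1: Codon optimisation of amyB.

AA    Codon   Optimal codon      amyB w.t.     amyB w.t.        amyB optimised   amyB optimised
              distribution [%]   [# codons]    [% codons/AA]    sc [# codons]    sc & cp [# codons]
Ala   GCT      38                 5             11.9            16               18
Ala   GCC      51                15             35.7            21               23
Ala   GCA       0                12             28.6             0                0
Ala   GCG      11                10             23.8             5                1
Cys   TGT       0                 7             77.8             0                0
Cys   TGC     100                 2             22.2             9                9
Asp   GAT      36                20             47.6            15               15
Asp   GAC      64                22             52.4            27               27
Glu   GAA      26                 5             41.7             3                3
Glu   GAG      74                 7             58.3             9                9
Phe   TTT       0                 3             20.0             0                0
Phe   TTC     100                12             80.0            15               15
Gly   GGT      49                10             23.3            21               22
Gly   GGC      35                18             41.9            15               15
Gly   GGA      16                10             23.3             7                6
Gly   GGG       0                 5             11.6             0                0
His   CAT       0                 3             42.9             0                0
His   CAC     100                 4             57.1             7                7
Ile   ATT      27                 7             25.0             7                7
Ile   ATC      73                19             67.9            21               21
Ile   ATA       0                 2              7.1             0                0
Lys   AAA       0                 7             35.0             0                0
Lys   AAG     100                13             65.0            20               20
Leu   TTA       0                 1              2.7             0                0
Leu   TTG      13                10             27.0             5                4
Leu   CTT      17                 4             10.8             6                7
Leu   CTC      38                13             35.1            14               15
Leu   CTA       0                 3              8.1             0                0
Leu   CTG      32                 6             16.2            12               11
Met   ATG     100                10            100.0            10               10
Asn   AAT       0                 3             11.5             0                0
Asn   AAC     100                23             88.5            26               26
Pro   CCT      36                 6             27.3             8                8
Pro   CCC      64                 8             36.4            14               14
Pro   CCA       0                 3             13.6             0                0
Pro   CCG       0                 5             22.7             0                0
Gln   CAA       0                 5             25.0             0                0
Gln   CAG     100                15             75.0            20               20
Arg   CGT      49                 1             10.0             5                5
Arg   CGC      51                 2             20.0             5                5
Arg   CGA       0                 2             20.0             0                0
Arg   CGG       0                 2             20.0             0                0
Arg   AGA       0                 0              0.0             0                0
Arg   AGG       0                 3             30.0             0                0
Ser   TCT      21                 4             10.8             8                8
Ser   TCC      44                 9             24.3            16               17
Ser   TCA       0                 4             10.8             0                0
Ser   TCG      14                10             27.0             5                4
Ser   AGT       0                 4             10.8             0                0
Ser   AGC      21                 6             16.2             8                8
Thr   ACT      30                 9             22.5            12               12
Thr   ACC      70                13             32.5            28               28
Thr   ACA       0                10             25.0             0                0
Thr   ACG       0                 8             20.0             0                0
Val   GTT      27                 5             16.1             8                9
Val   GTC      54                12             38.7            17               17
Val   GTA       0                 4             12.9             0                0
Val   GTG      19                10             32.3             6                5
Trp   TGG     100                12            100.0            12               12
Tyr   TAT       0                11             31.4             0                0
Tyr   TAC     100                24             68.6            35               35

Table 2.2: Codon optimisation of amyB.

Sequence        Type                 fit_sc    fit_cp     w_cp ≤ 0    fit_combi (cpi = 0.2)
SEQ ID NO. 2    WT                   0.1652     0.0329    37.3%        0.090
SEQ ID NO. 5    sc optimised         0.0046    -0.1765    73.9%       -0.862
SEQ ID NO. 6    sc + cp optimised    0.0109    -0.3420    92.6%       -1.621

All three sequences listed in Table 2.2 are coding sequences whose translated sequence is designated SEQ ID NO. 3.

3. Example 3: Testing the method of the invention for constructing improved DNA sequences to provide improved production of the Aspergillus niger fungal amylase in A. niger

The method of the invention is used below to improve the single-codon and codon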
對使用。該方法可以用相同的方式應(yīng)用于任何核苷酸序列的密碼子使用改善和改進(jìn)的表達(dá)。3.1材料和方法3丄1菌株WT1:該菌株被用作野生型菌株。該菌株在保藏號CBS513.88下保藏于CBSInstitute。WT2:該Am'ger菌株是包含編碼葡萄糖淀粉酶的基因(g/aA)缺失的WT1菌株。按照EP0635574Bl中所述,通過使用"MARKER-GENEFREE"途徑,構(gòu)建WT2。在該專利中詳盡地描述了如何缺失CBS513.88基因組中的g/aA特異DNA序列。該步驟產(chǎn)生MARKER-GENEFREEAg/aA重組體Am'gwCBS513.88菌株,其最終完全不具有外來的DNA序列。WT3:該Am'gw菌株是含有突變的WT2菌株,所述突變導(dǎo)致草酸鹽缺陷的Am'ger菌株。通過使用如EP1590444中所述的方法來構(gòu)建WT3。在該專利申請中詳盡地描述了如何篩選草酸鹽缺陷的Am'gw菌株。根據(jù)EP1590444的實施例1和2的方法來構(gòu)建菌株WT3,菌株WT3是EP1590444的突變體菌株22(在EP1590444中命名為FINAL)。WT4:該Am'ger菌株是在三個先后的步驟中包含編碼a-淀粉酶的三個基因(awjB,am少BI禾naw少BII)缺失的WT3菌株。缺失載體的構(gòu)建和這三個基因的基因組缺失在WO2005095624中詳細(xì)描述。WO2005095624中描述的載體pDEL-AMYA、pDEL-AMYBI和pDEL-AMYBII已如EP0635574Bl中所述根據(jù)"MARKER-GENEFREE"途徑使用。上述步驟產(chǎn)生草酸鹽缺陷、MARKER-GENEFREEAg/aA、Aam;/A、Aflw少BI禾卩AamyBII淀粉酶-陰性重組體A"&wCBS513.88菌株,其最終完全不含有外來的DNA序列。因此,與WT1相比,WT4針對a-淀粉酶表達(dá)更加優(yōu)化。3丄2Am'ger搖瓶發(fā)酵如W099/32617的實施例"A搖瓶發(fā)酵"章節(jié)中所述,在20ml預(yù)培養(yǎng)基中預(yù)培養(yǎng)Aw&er菌株。過夜生長后,將10ml該培養(yǎng)物轉(zhuǎn)移63至發(fā)酵培養(yǎng)基l(FMl)中進(jìn)行a-淀粉酶發(fā)酵。在含100ml發(fā)酵液的500ml帶蓋三角瓶中于34'C和170rpm下將發(fā)酵進(jìn)行指定的天數(shù),一般如W099/32617中所述。該FM1培養(yǎng)基每升含有52.570g葡萄糖、8.5g麥芽糖、25g水解酪蛋白、12.5g酵母提取物、1gKH2P04、2gK2S04、0.5gMgS04-7H20、0.03gZnCl2、80.02gCaCl2、0.01gMnS04,4H20、0.3gFeS04.7H20、10mlPen-Strep(Invitrogen,cat.nr.10378-016)、48gMES,用4N112304調(diào)節(jié)至pH5.6。3丄3真菌a-淀粉酶活性為了測定Aw&w培養(yǎng)液中的ce-淀粉酶活性,根據(jù)供應(yīng)商的方案,使用Megazyme谷物a-淀粉酶試劑盒(Megazyme,CERALPHAa-淀粉酶測試試劑盒,產(chǎn)品目錄號K-CERA,2000-2001年)。測量的活性基于存在過量的葡萄糖淀粉酶和a-葡萄糖苷酶時非-還原-端封閉的p-硝基苯基麥芽七糖苷(p-nitrophenylmaltoheptaoside)的水解。形成的p-硝基苯酚的量是樣品中存在的a-淀粉酶的度量。3.2針對編碼Am'gwa-淀粉酶JmvA的野生型amv丑編碼序列構(gòu)建Aspergillus表達(dá)構(gòu)建體野生型amyB的DNA序列已在2.2.1中描述過。為了Am'gw"m少B構(gòu)建體在^^wgz7/w物種中的表達(dá)分析,使用基于pGBFIN的表達(dá)構(gòu)建體將強amyB啟動子用于a-淀粉酶在Am'gw中的過表達(dá)(如W099/32617中所述)。包括PamyB的ATG起始密碼子的"mj^啟動子的翻譯起始序列為5,-GGCATTTATGATG-3,或5,-GAAGGCATTTATG-3,,取決于選擇哪個ATG作為起始密碼子。在下文產(chǎn)生的所有后來的"m;;5表達(dá)構(gòu)建體中,PamyB的該翻譯起始序列已經(jīng)被修飾為5'-CACCGTCAAAATG-3,。在兩端引入適當(dāng)?shù)南拗菩晕稽c,以允許在表達(dá)載體中克隆。天
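native amyB gene contains a 'TGA' stop codon. In all amyB constructs made below, the 5'-TGA-3' translational termination sequence was replaced by 5'-TAAA-3', followed by the 5'-TTAATTAA-3' of the PacI restriction site. An XhoI site was introduced at the 5' end and a PacI site at the 3' end. Therefore, a fragment comprising the modified genomic amyB promoter and the amyB cDNA was completely synthesized and cloned, and its sequence was confirmed by sequence analysis.

The fragment comprising the α-amylase promoter (with the modified translational initiation sequence) and the amyB cDNA sequence (with the modified translational termination sequence) was digested with XhoI and PacI and introduced into an XhoI- and PacI-digested pGBFIN-12 vector (constructed and designed as described in WO 99/32617), yielding pGBFINFUA-1 (Figure 21). The sequence of the introduced PCR fragment was confirmed by sequence analysis and is given in SEQ ID NO. 4.

3.3 Improving the single-codon usage of the α-amylase coding sequence amyB for expression in A. niger

Single-codon usage optimization is applied below to improve the codon usage of the A. niger amyB gene. The nucleotide coding sequence of the native amyB is shown as SEQ ID NO. 2.

The codon usage of the native A. niger amyB gene and of the synthetic optimized variant is given in Table 2.1 below. For both the native and the single-codon-optimized synthetic amyB gene, the exact number of each codon and its distribution per amino acid are given. In addition, the third column provides the proposed optimal distribution, which is the target of the optimization.

For the group 1 amino acids there is only one possibility. Group 1 consists of methionine, which is always encoded by ATG, and tryptophan, which is always encoded by TGG.

The group 2 amino acids are optimized to the extreme frequencies of 0% or 100%, so the strategy is clear. All codons of a group 2 amino acid are specifically changed into the optimal variant of the two possible codons. More explicitly: for cysteine, codon TGT is replaced by TGC; for phenylalanine, TTT by TTC; for histidine, CAT by CAC; for lysine, AAA by AAG; for asparagine, AAT by AAC; for glutamine, CAA by CAG; for tyrosine, TAT by TAC.

The group 3 amino acids can be encoded by several codons, as indicated in Table 3.1. Each codon, which is to be present at a preferred codon frequency, is optimized according to the method below (for alanine GCT, GCC, GCA, GCG; for aspartate GAT, GAC; for glutamate GAA, GAG; for glycine GGT, GGC, GGA, GGG; for isoleucine ATT, ATC, ATA; for leucine TTA, TTG, CTT, CTC, CTA, CTG; for proline CCT, CCC, CCA, CCG; for arginine CGT, CGC, CGA, CGG, AGA, AGG; for serine TCT, TCC, TCA, TCG, AGT, AGC; for threonine ACT, ACC, ACA, ACG; for valine GTT, GTC, GTA, GTG).

For the group 3 amino acids and their encoding codons, the optimal occurrence of each possible codon within a given coding sequence is calculated as follows:
i. for each group 3 amino acid (AA), sum the total number of residues encoded in the given sequence, see column A1 (Table 3.1);
ii. for each AA and each codon encoding that AA, multiply the total number for that AA by the optimal codon distribution in Table 2.1, giving a raw codon distribution that may generally contain decimal numbers, see column A2 (Table 3.2);
iii. round off the values of the raw codon distribution (ii) by removing the decimal digits, giving a rounded codon distribution, see column A3 (Table 3.2);
iv. for each AA, sum the total number of residues represented in the rounded codon distribution (iii), see column A4 (Table 3.1);
v. calculate the total number of missing residues for each AA in the rounded codon distribution by subtracting the total represented in the rounded codon distribution (iv) from the total number of residues encoded in the given sequence (i), see column A5 (Table 3.1);
vi. for each codon, calculate by subtraction the decimal difference between the raw codon distribution (ii) and the rounded codon distribution (iii), see column A6 (Table 3.2);
vii. for each codon, multiply the decimal difference (vi) by the optimal codon distribution in Table 1, giving a weight value for each codon, see column A7 (Table 3.2);
viii. for each AA, select for the number of missing residues (v) the corresponding number of codons that have the highest weight value (vii), see column A8 (Table 3.2);
ix. calculate the final optimal codon distribution within the given sequence encoding the polypeptide by adding, for each codon, the rounded codon distribution (iii) and the selected number of missing residues (viii), see column A9 (Table 3.2).

Table 3.1

AA(i)   i    A1   A4   A5
Ala     1    42   40   2
Asp     2    42   41   1
Glu     3    12   11   1
Gly     4    43   42   1
Ile     5    28   27   1
Leu     6    37   35   2
Pro     7    22   21   1
Arg     8    10   9    1
Ser     9    37   35   2
Thr     10   40   40   0
Val     11   31   29   2

Table 3.2

Codon     A2      A3   A6     A7      A8   A9
Ala_GCT   15.96   15   0.96   0.365   1    16
Ala_GCC   21.42   21   0.42   0.214   1    22
Ala_GCA   0       0    0      0.000   0    0
Ala_GCG   4.62    4    0.62   0.068   0    4
Asp_GAT   15.12   15   0.12   0.043   0    15
Asp_GAC   26.88   26   0.88   0.563   1    27
Glu_GAA   3.12    3    0.12   0.031   0    3
Glu_GAG   8.88    8    0.88   0.651   1    9
Gly_GGT   21.07   21   0.07   0.034   0    21
Gly_GGC   15.05   15   0.05   0.018   0    15
Gly_GGA   6.88    6    0.88   0.141   1    7
Gly_GGG   0       0    0      0.000   0    0
Ile_ATT   7.56    7    0.56   0.151   0    7
Ile_ATC   20.44   20   0.44   0.321   1    21
Ile_ATA   0       0    0      0.000   0    0
Leu_TTA   0       0    0      0.000   0    0
Leu_TTG   4.81    4    0.81   0.105   1    5
Leu_CTT   6.29    6    0.29   0.049   0    6
Leu_CTC   14.06   14   0.06   0.023   0    14
Leu_CTA   0       0    0      0.000   0    0
Leu_CTG   11.84   11   0.84   0.269   1    12
Pro_CCT   7.92    7    0.92   0.331   1    8
Pro_CCC   14.08   14   0.08   0.051   0    14
Pro_CCA   0       0    0      0.000   0    0
Pro_CCG   0       0    0      0.000   0    0
Arg_CGT   4.9     4    0.9    0.441   1    5
Arg_CGC   5.1     5    0.1    0.051   0    5
Arg_CGA   0       0    0      0.000   0    0
Arg_CGG   0       0    0      0.000   0    0
Arg_AGA   0       0    0      0.000   0    0
Arg_AGG   0       0    0      0.000   0    0
Ser_TCT   7.77    7    0.77   0.162   1    8
Ser_TCC   16.28   16   0.28   0.123   0    16
Ser_TCA   0       0    0      0.000   0    0
Ser_TCG   5.18    5    0.18   0.025   0    5
Ser_AGT   0       0    0      0.000   0    0
Ser_AGC   7.77    7    0.77   0.162   1    8
Thr_ACT   12      12   0      0.000   0    12
Thr_ACC   28      28   0      0.000   0    28
Thr_ACA   0       0    0      0.000   0    0
Thr_ACG   0       0    0      0.000   0    0
Val_GTT   8.37    8    0.37   0.100   0    8
Val_GTC   16.74   16   0.74   0.400   1    17
Val_GTA   0       0    0      0.000   0    0
Val_GTG   5.89    5    0.89   0.169   1    6

Subsequently, a completely new nucleotide coding sequence was created by randomly distributing the proposed number of synonymous codons (Table 2.1) for each amino acid in the original amyB peptide. The synthetic amyB sequence resulting from this process is given in SEQ ID NO. 5. The secondary structure of the modified coding sequence was checked for the possible occurrence of harmful secondary structures using the Clone Manager 7 program (Sci. Ed. Central: Scientific & Educational Software, version 7.02).

3.4 Optimization of the α-amylase coding sequence amyB for expression in A. niger according to the combined single-codon and codon-pair method of the invention

The method of the invention was applied to improve the coding sequence of the A. niger amyB gene. The optimized amyB sequence resulting from the process described in Example 2 is given in SEQ ID NO. 6. The secondary structure of the modified coding sequence was checked for the possible occurrence of harmful secondary structures using the Clone Manager 7 program (Sci. Ed. Central: Scientific & Educational Software, version 7.02).

3.5 Construction of modified amyB expression vectors for expression of the A. niger α-amylase AmyB encoded by the coding sequences described in Examples 3.2 and 3.3

The DNA sequence of the XhoI-PacI fragment of pGBFINFUA-1 (Figure 21) is shown in SEQ ID NO. 4 and comprises the amyB promoter and the wild-type amyB cDNA sequence with the modified translational initiation sequence and the modified translational termination sequence. The DNA sequence described in Example 1.2 is shown as SEQ ID NO. 7; it comprises a variant of the translational initiation sequence of the α-amylase promoter combined with the codon-optimized coding sequence of the amyB gene encoding α-amylase. The DNA sequence described in Example 3.3 is shown as SEQ ID NO. 8; it comprises a variant of the translational initiation sequence of the α-amylase promoter combined with the optimized coding sequence of the amyB gene encoding α-amylase, optimized according to the combined single-codon and codon-pair method of the invention.

To clone these modified sequence variants in an expression vector, the two synthetic gene fragments were digested with XhoI and PacI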
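and introduced into the larger fragment of an XhoI- and PacI-digested pGBFINFUA-1 vector (Figure 21), generating variant expression vectors. After checking for integration of the correct fragments, the variant expression constructs were named pGBFINFUA-2 and pGBFINFUA-3, as described in Table 3.3 below.

Table 3.3: Modified expression constructs for α-amylase expression in A. niger
Columns: plasmid name; SEQ ID NO; translational initiation sequence; coding sequence; translational termination sequence.
[Table body not reproduced in the source; see original document, page 70.]

The translated sequences of the amyB coding sequences of plasmids pGBFINFUA-1 to pGBFINFUA-3 correspond to the amino acid sequence shown in SEQ ID NO. 3, which represents the wild-type A. niger α-amylase.

3.6 Expression in A. niger of the modified pGBFINFUA- expression constructs of A. niger α-amylase

The pGBFINFUA-1, pGBFINFUA-2 and pGBFINFUA-3 expression constructs prepared as described above were introduced into A. niger by transformation as described below and according to the strategy shown in Figure 22.

To introduce the three pGBFINFUA-1, -2 and -3 vectors (Table 3.3) into WT4, transformation and subsequent selection of transformants were carried out as described in WO 98/46772 and WO 99/32617. In brief, linear DNA of the pGBFINFUA- constructs was isolated and used to transform A. niger. Transformants were selected on acetamide medium and colonies were purified according to standard procedures. Colonies were diagnosed for integration at the glaA locus and for copy number using PCR. For each of the pGBFINFUA-1, -2 and -3 constructs, ten independent transformants with similar estimated copy numbers (low copy, 1-3) were selected and numbered using the name of the transforming plasmid, for example FUA-1-1 (for the first pGBFINFUA-1 transformant) and FUA-3-1 (for the first pGBFINFUA-3 transformant).

The selected FUA- strains and A. niger WT4 were used in shake-flask experiments in 100 ml of medium under the conditions described above. Samples were taken after 3 and 4 days of fermentation.

α-Amylase production was measured in all three different A. niger FUA- transformants. As can be seen in Figure 23, optimization of the coding sequence according to the method of the invention gives a greater improvement in AmyB expression than the other method tested, referred to as single-codon optimization. These figures are summarized in Table 3.4 below.

Table 3.4: Relative average α-amylase activity of transformants carrying the wild-type construct compared with transformants carrying modified amyB coding sequences (as summarized from Figure 23).

Strain type   SEQ ID NO   Coding sequence                          α-Amylase activity
FUA-1         4           w.t.                                     100%
FUA-2         7           single-codon optimized                   200%
FUA-3         8           modified according to the invention      400%

These results clearly indicate that the method of the invention can be applied to improve protein expression in a host even when the expression construct and the host already carry several other optimizations, such as a strong promoter, an improved translational initiation sequence, an improved translational termination sequence, optimal single-codon usage and/or a host improved for protein expression.

4. Example 4: Design of improved DNA sequences for expression of three heterologous enzymes in the Bacillus species Bacillus subtilis and Bacillus amyloliquefaciens

4.1 Introduction

Example 4 describes the experimental design and the application of the method of the invention described in this patent for the (improved) expression of heterologous proteins in two species, more specifically in this example Bacillus subtilis and Bacillus amyloliquefaciens. A preferred expression host is Bacillus amyloliquefaciens.

The B. subtilis genome was published in 1997, and those of other Bacillus species followed (Kunst, F. et al. 1997. The complete genome sequence of the Gram-positive bacterium Bacillus subtilis. Nature 390:249-56; Rey, M.W. et al. (2004). Complete genome sequence of the industrial bacterium Bacillus licheniformis and comparisons with closely related Bacillus species. Genome Biology 5:R77; Rasko, D.A. et al. (2005). Genomics of the Bacillus cereus group of organisms. FEMS Microbiology Reviews 29:303-329).

In this example, the complete sequence set of B. subtilis was chosen as the basis for calculating the single-codon frequencies and the codon-pair weights. Comparison of GC content and tRNAs gives a similar picture for these Bacillus species (see above), which indicates that the same statistics can be applied to other related Bacillus species. In addition, it is already clear from Example 1 (see also Figure 4) that related species show similar codon-pair frequencies.

Figure 4 (see also Example 1) shows a codon-pair comparison plot based on the whole-genome statistics of B. subtilis versus B. amyloliquefaciens. A good correlation between the two data sets is observed. Moreover, B. amyloliquefaciens appears to be more versatile, since there is a subset of codon-pair combinations that are well accepted in B. amyloliquefaciens but have highly negative values in B. subtilis, whereas the opposite is not observed.

4.2. Experimental design

Three protein sequences were selected for expression in both Bacillus subtilis and Bacillus amyloliquefaciens:
Protein 1: xylose (glucose) isomerase xylA (EC 5.3.1.5) from Bacillus stearothermophilus;
Protein 2: xylose (glucose) isomerase xylA (EC 5.3.1.5) from Streptomyces olivochromogenes;
Protein 3: L-arabinose isomerase (EC 5.3.1.4) from Thermoanaerobacter mathranii.

Table 4.1: Overview of the gene constructs; protein 2 was selected to explore the codon-pair concept more broadly. Cells are listed in the order in which they appear in the source.

Gene        Protein          Single-codon optimization   Single codon & positive codon-pair optimization   Single codon & negative codon-pair optimization
Protein 1   SEQ ID NO. 9     SEQ ID NO. 16               SEQ ID NO. 13
Protein 2   SEQ ID NO. 10    SEQ ID NO. 17               SEQ ID NO. 14                                     SEQ ID NO. 18
Protein 3   SEQ ID NO. 11    SEQ ID NO. 12               SEQ ID NO. 15

Table 4.1 lists the methods applied to the three genes described above. For protein 1, protein 2 and protein 3, the codon-pair optimization of the method of the invention was applied in addition to the previously developed single-codon optimization.

As controls, two additional constructs of protein 2 were included in the experiment to test the effect of single-codon optimization and of negative codon-pair optimization. One variant (SEQ ID NO. 18) was designed to be "optimized" toward bad codon pairs (i.e. negative codon-pair optimization), and a second variant was designed with single-codon optimization only (SEQ ID NO. 17). Protein 2 was chosen because Streptomyces species show a highly divergent codon-pair bias, see Example 1 and Figure 4.

All genes designed for B. amyloliquefaciens avoid the occurrence of NdeI (CATATG) and BamHI (GGATTC) restriction sites. In addition, they contain a single restriction site for removing the E. coli part of the cloning vector pBHA12.

4.3. Single-codon optimization

The single-codon-optimized variants of protein 1 and protein 2 were designed according to the method for single-codon optimization described in Example 3.3, giving SEQ ID NO. 16 and SEQ ID NO. 17, respectively. The single-codon distribution table applied (Table 4.2) was determined using the 50 most highly expressed genes, which were identified using 24 Affymetrix GeneChips for B. subtilis 168 (covering 6 independent fermentation time series). All GeneChips were normalized to their arithmetic mean. The expression list excluded genes that had been deliberately overexpressed during strain engineering, since their measured expression levels cannot be correlated with their codon usage.

The single-codon distribution in Table 4.2 was determined by visual inspection of codon frequency histograms for the 50, 100, 200 and 400 most highly expressed sequences and for all B. subtilis sequences. Where the most highly expressed genes showed a clear trend toward 0% or 100%, an assignment of 0% or 100%, respectively, was made. For the remaining, unassigned codons, the average usage was calculated and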
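normalized over the set of synonymous codons, leaving out the assigned codons. The resulting target single-codon frequencies are given in column 3 of Table 4.2.

Table 4.2: Codon usage distribution for synthetic gene design, based on the 50 most highly expressed genes and on visual inspection of single-codon usage histograms such as Figure 24. "Don't care" entries can be applied during codon-pair optimization to leave the choice of these codons free, so that single-codon optimization is not taken into account for them.

AA        Codon   Single-codon distribution (%)   Don't care = 0 / care = 1
A Ala     GCT     50                              0
  Ala     GCC     0                               1
  Ala     GCA     50                              0
  Ala     GCG     0                               1
C Cys     TGT     51                              0
  Cys     TGC     49                              0
D Asp     GAT     63                              1
  Asp     GAC     37                              1
E Glu     GAA     100                             1
  Glu     GAG     0                               1
F Phe     TTT     55                              0
  Phe     TTC     45                              0
G Gly     GGT     31                              1
  Gly     GGC     34                              1
  Gly     GGA     35                              1
  Gly     GGG     0                               1
H His     CAT     71                              0
  His     CAC     29                              0
I Ile     ATT     60                              0
  Ile     ATC     40                              0
  Ile     ATA     0                               1
K Lys     AAA     100                             1
  Lys     AAG     0                               1
L Leu     TTA     39                              0
  Leu     TTG     24                              0
  Leu     CTT     37                              0
  Leu     CTC     0                               1
  Leu     CTA     0                               1
  Leu     CTG     0                               1
M Met     ATG     100                             1
N Asn     AAT     45                              0
  Asn     AAC     55                              0
P Pro     CCT     35                              0
  Pro     CCC     0                               1
  Pro     CCA     22                              0
  Pro     CCG     43                              0
Q Gln     CAA     100                             1
  Gln     CAG     0                               1
R Arg     CGT     38                              0
  Arg     CGC     34                              0
  Arg     CGA     0                               1
  Arg     CGG     0                               1
  Arg     AGA     28                              0
  Arg     AGG     0                               1
S Ser     TCT     34                              0
  Ser     TCC     0                               1
  Ser     TCA     34                              0
  Ser     TCG     0                               1
  Ser     AGT     0                               1
  Ser     AGC     32                              0
T Thr     ACT     33                              0
  Thr     ACC     0                               1
  Thr     ACA     46                              0
  Thr     ACG     22                              1
V Val     GTT     47                              1
  Val     GTC     0                               1
  Val     GTA     23                              1
  Val     GTG     30                              1
W Trp     TGG     100                             1
Y Tyr     TAT     62                              0
  Tyr     TAC     38                              0
  Stop    TGA     0                               1
  Stop    TAG     0                               1
  Stop    TAA     100                             1

4.4. Codon-pair optimization

Codon-pair optimization was carried out according to the method of the invention. The optimized coding nucleotide sequences SEQ ID NO. 13-15 are the result of a run with the described software method. The parameters applied were: population size = 200; number of iterations = 1000; cpi = 0.20; CPW matrix = "Table C.4. CPW: Bacillus subtilis - highly expressed sequences"; CR matrix = "Table B.1, column 5: CR table BSS: Bacillus subtilis - highly expressed sequences" (also given in Table 4.2); and the "don't care" elements as in Table 4.2. In addition, a penalty value of +1 was added to fit_combi for each occurrence of an NdeI (CATATG) or BamHI (GGATTC) restriction site.

The optimized coding nucleotide sequence SEQ ID NO. 18 is the result of a run with the described software method. The parameters applied were: population size = 200; number of iterations = 1000; cpi = 0.20; CPW matrix = -1 times "Table C.4. CPW: Bacillus subtilis - highly expressed sequences" (to obtain codon-pair optimization toward bad codon pairs); CR matrix = "Table B.1, column 5: CR table BSS: Bacillus subtilis - highly expressed sequences" (also given in Table 4.2); and the "don't care" elements as in Table 4.2. In addition, a penalty value of +1 was added to fit_combi for each occurrence of an NdeI (CATATG) or BamHI (GGATTC) restriction site.

The "don't care" elements in Table 4.2 were chosen for those codons that show no codon bias. This was done by visual inspection of the single-codon bias plots, see 4.3. The use of such elements gives additional freedom to the codon-pair part of the optimization.

All optimizations converged toward a minimum value of fit_combi. Table 4.3 gives the objective values obtained for SEQ ID NO. 13-15 and SEQ ID NO. 18, together with those obtained for SEQ ID NO. 11, SEQ ID NO. 16 and SEQ ID NO. 17. These data show that the single-codon statistics of SEQ ID NO. 16 and SEQ ID NO. 17 are highly similar to those of SEQ ID NO. 14 and SEQ ID NO. 15. However, the method of the invention leads to genes with an increased number of codon pairs that have negative weights associated with them, indicating a more optimal usage of codon pairs, see Table 4.3. "Optimization" by maximizing fit_cp leads to genes with an increased number of codon pairs that have positive weights associated with them, and therefore to an expected adverse effect on translation characteristics. The fraction of codon pairs with w_cp(g) ≤ 0 is 24% for SEQ ID NO. 18 versus 85% for SEQ ID NO. 14, and fit_combi likewise goes from 1.20 to -1.43.

Table 4.3: Codon optimization; objective fitness values of the genes for expression in B. subtilis and B. amyloliquefaciens.

Sequence        Type                        fit_sc        fit_cp   w_cp(g) ≤ 0   fit_combi (cpi = 0.2)
SEQ ID NO. 11   WT                          0.078         0.097    41.1%         0.350
SEQ ID NO. 13   sc + cp optimized           0.004         -0.293   89.1%         -1.439
SEQ ID NO. 14   sc + cp optimized           [illegible]   -0.292   84.8%         -1.431
SEQ ID NO. 15   sc + cp optimized           [illegible]   -0.303   89.2%         -1.493
SEQ ID NO. 16   sc optimized                0.002         -0.023   56.9%         -0.114
SEQ ID NO. 17   sc optimized                [illegible]   0.087    44.3%         0.428
SEQ ID NO. 18   sc + negative cp optimized  0.015         0.257    23.5%         1.196

5. Example 5: Testing the method of the invention for expression of three heterologous enzymes in Bacillus subtilis and Bacillus amyloliquefaciens

5.1 Introduction

Example 5 describes the experiments and results for the expression of the three heterologous genes, with their sequence variants, in both Bacillus subtilis and Bacillus amyloliquefaciens host cells. The variants were made according to the method of the invention, as described in Example 4.

5.2 Materials and methods

5.2.1 Bacillus growth medium

2xTY (per litre): tryptone peptone 16 g, yeast extract (Difco) 10 g, NaCl 5 g.

5.2.2 Transformation of B. subtilis

Media

2x Spizizen medium: 28 g K2HPO4; 12 g KH2PO4; 4 g (NH4)2SO4; 2.3 g trisodium citrate.2H2O; 0.4 g MgSO4.7H2O; add H2O to 900 ml and adjust to pH 7.0-7.4 with 4 N NaOH. Add H2O to 1 litre. Autoclave for 20 minutes at 120°C.

1x Spizizen-plus medium: to 50 ml of 2x Spizizen medium add 50 ml milliQ water, 1 ml of 50% glucose and 100 μl of casamino acids (20 μg/ml final concentration).

A single Bacillus colony from a non-selective 2xTY agar plate (or an aliquot from a deep-freeze vial) was inoculated into 10 ml of 2xTY broth in a 100 ml shake flask. The cells were grown overnight in an incubator shaker at 37°C and approximately 250 rpm. The OD was measured at 600 nm and the culture was diluted with 1x Spizizen-plus medium to an OD600 of about 0.1. The cells were grown at 37°C and 250-300 rpm until the OD600 of the culture reached 0.4-0.6. The culture was diluted 1:1 with 1x Spizizen medium supplemented with 0.5% glucose (starvation medium) and incubated for 90 minutes at 37°C and 250-300 rpm. The culture was centrifuged for 10 minutes at 4500 rpm in a tabletop centrifuge. 90% of the supernatant was removed and the pellet was resuspended in the remaining volume. DNA (1-5 μg in a maximum volume of 20 μl) was mixed with 0.5 ml of competent cells in a universal tube and incubated for 1 hour at 37°C in a rotary shaking water bath with firm shaking (5/6). The cells were plated (20 to 200 μl) on selective 2xTY agar plates containing 25 μg/ml kanamycin and incubated overnight at 37°C.

5.2.3 Preparation of cell-free extracts

The pellet from 1 ml of culture was resuspended in buffer A containing 10 mM Tris-HCl (pH 7.5), 10 mM EDTA, 50 mM NaCl, 1 mg/ml lysozyme and protease inhibitors (Complete EDTA-free protease inhibitor cocktail, Roche). The resuspended pellet was incubated for 30 minutes at 37°C for protoplasting and then sonicated as follows: 30 seconds at 10 micron amplitude (3 cycles), with 15 seconds of cooling between cycles. After sonication, cell debris was removed by centrifugation (10 minutes, 13000 rpm at 4°C) and the cleared lysate was used for further analysis.

5.2.4 Selection of glucose isomerase and L-arabinose isomerase encoding genes and design of synthetic genes for expression in Bacillus amyloliquefaciens and Bacillus subtilis

The three enzymes selected were:
1. Bacillus stearothermophilus xylose isomerase (P54272, Swissprot); protein sequence SEQ ID NO. 9;
2. Streptomyces olivochromogenes xylose isomerase (P15587, Swissprot); protein SEQ ID NO. 10;
3. Thermoanaerobacter mathranii L-arabinose isomerase (AJ582623.1, EMBL; see also US 2003/012971 A1); protein SEQ ID NO. 11, nucleotide SEQ ID NO. 12.

As can be seen above, the selected enzymes are of different microbial origin. For the purpose of overproducing these enzymes in Bacillus subtilis or Bacillus amyloliquefaciens, the nucleotide sequence encoding each protein was optimized so as to make it suitable for expression in Bacillus species, see Example 4. The optimized sequences are listed in the sequence listing as SEQ ID NO. 13 (Bacillus stearothermophilus glucose (xylose) isomerase), SEQ ID NO. 14 (Streptomyces olivochromogenes glucose (xylose) isomerase) and SEQ ID NO. 15. As controls, variants with single-codon optimization but without codon-pair optimization (SEQ ID NO. 16-17) and a variant with single-codon optimization combined with "negative codon-pair optimization" (SEQ ID NO. 18) were generated, see Example 4 and Table 4.1.

5.3 Cloning of the glucose isomerase and L-arabinose isomerase encoding genes in an E. coli/Bacillus shuttle vector and transformation into Bacillus

To express the selected genes in Bacillus, the pBHA12 E. coli/Bacillus shuttle vector was used (Figure 26). This vector is largely derived from the expression vector pBHA-1 (EP 340878), in which the promoter of the amyQ gene from Bacillus amyloliquefaciens replaced the original promoter. The pBHA12 plasmid contains two multiple cloning sites (Figure 26).

All selected and optimized genes were made synthetically (DNA2.0, Menlo Park, CA, U.S.A.) as two fragments (A and B). The A fragment, corresponding to the 5' end of the gene, was cloned behind the amyQ promoter. Both fragments were extended with specific restriction endonuclease sites to allow direct cloning in multiple cloning sites 1 and 2 (see Figure 27). The 3' end of fragment A and the 5' end of fragment B overlap at a unique restriction endonuclease site, which allows the E. coli part of the vector to be excised and the vector to be ligated back before transformation of Bacillus subtilis (CBS 363.94). E. coli was used as an intermediate host during the cloning steps leading up to the transformation of B. subtilis. The two-step cloning route in pBHA12 was chosen to avoid possible problems during cloning and propagation of the expression vectors in E. coli.

Table 5.1 lists the restriction enzyme recognition sites added to fragments A and B, as well as the unique restriction sites that allow backligation and faithful reconstitution of the complete functional genes. The 5' end of every A fragment contains an NdeI site (recognition sequence CATATG), which allows the gene to be cloned as a fragment starting exactly at its respective start codon (ATG).

Table 5.1: Overview of the restriction endonuclease (RE) cloning sites that were added to the gene fragments to facilitate cloning in pBHA12.

Gene                        Fragment A 5' end   Fragment A 3' end   Fragment B 5' end   Fragment B 3' end   Unique RE site (position in gene)
B. stearothermophilus GI    [illegible]         [illegible]         [illegible]         [illegible]         [illegible] (496 bp)
S. olivochromogenes GI      [illegible]         [illegible]         [illegible]         [illegible]         ClaI (372 bp)
T. mathranii ARA A          [illegible]         [illegible]         [illegible]         [illegible]         ClaI (708 bp)

The A and B fragments of the 5 genes were cloned in two steps into MCS 1 and MCS 2, respectively, as shown for SEQ ID NO. 13 in Figure 27, using standard molecular biology methods (Sambrook & Russell, Molecular Cloning: A Laboratory Manual, 3rd Ed., CSHL Press, Cold Spring Harbor, NY, 2001; and Ausubel et al., Current Protocols in Molecular Biology, Wiley InterScience, NY, 1995). Transformations were carried out in E. coli TOP10 (Invitrogen) or, where a methylation-sensitive restriction endonuclease was to be used in a subsequent step, in INV110 (Invitrogen). Several ampicillin-resistant E. coli transformants were isolated for each expression construct using mini or midi plasmid isolation kits (from Macherey-Nagel and Sigma, respectively). Correct ligation of the corresponding A and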
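B fragments in the pBHA12 vector was confirmed by restriction analysis. In the next step, the pBHA12 plasmids containing the A and B fragments of the genes were digested with the unique restriction endonuclease (see Table 5.1) to excise the E. coli part of the vector. The Bacillus part of the vector, containing the interrupted gene, was isolated from an agarose gel using a gel extraction kit (Macherey-Nagel) and back-ligated. The ligation mixture was transformed into the B. subtilis CBS 363.94 strain by competent cell transformation. Several kanamycin-resistant B. subtilis transformants were isolated for each expression construct using mini or midi plasmid isolation kits (from Macherey-Nagel and Sigma, respectively). The expression constructs were checked by restriction analysis for the correct pattern after excision of the E. coli part and back-ligation of the Bacillus part of the pBHA12 vector. For each construct, three B. subtilis transformants were selected for analysis of cell-free extracts.

5.4 Detection of the enzymes overproduced in Bacilli

Three B. subtilis transformants and three B. amyloliquefaciens transformants of each construct were used to analyse the cell-free extracts for the presence of the corresponding protein (glucose isomerase or L-arabinose isomerase). The strains were grown in 2xTY fermentation medium. Samples (1 ml) were taken after 24 hours of fermentation (in shake flasks) and cell-free extracts were prepared, with protease inhibitors present in the extraction buffer. 13 μl of cell-free extract was analysed by SDS-PAGE (Invitrogen). For several transformants a clear band corresponding to the expected Mw of the overexpressed protein was detected. A visual comparison of the bands is given in Table 5.2.

Clearly, by using the codon-pair method, the method of the invention improved the protein production of the Bacillus stearothermophilus xylose isomerase, the Streptomyces olivochromogenes xylose isomerase and the Thermoanaerobacter mathranii L-arabinose isomerase; that is, it resulted in improved protein production compared with either the WT control gene or the single-codon-optimized variant. Moreover, no product at all was detected when negative codon-pair optimization was applied together with single-codon optimization.

Table 5.2: Overexpression of the three heterologous genes in Bacilli. WT: wild type; sc: single-codon optimization; cp: codon-pair optimization; cp-: negative codon-pair optimization.
B. subtilis
[Table body not reproduced in the source; see original document, page 81.]

References

Boycheva, S., Chkodrov, G. & Ivanov, I. (2003). Codon pairs in the genome of Escherichia coli. Bioinformatics 19(8):987-998.
Gurvich, O.L., Baranov, P.V., Gesteland, R.F. & Atkins, J.F. (2005). Expression levels influence ribosomal frameshifting at the tandem rare arginine codons AGG_AGG and AGA_AGA. J. Bacteriol. 187:4023-4032.
Gustafsson, C., Govindarajan, S. & Minshull, J. (2004). Codon bias and heterologous protein expression. Trends Biotechnol. 22(7):346-353.
Gutman, G.A. & Hatfield, G.W. (1989). Nonrandom utilization of codon pairs in Escherichia coli. PNAS 86:3699-3703.
Gygi, S.P., Rochon, Y., Franza, B.R. & Aebersold, R. (1999). Correlation between protein and mRNA abundance in yeast. Mol. Cel. Biol. 19(3):1720-30.
Hatfield, G.W. & Gutman, G.A. (1992). Codon pair utilization. United States Patent No 5,082,767.
Irwin, B., Heck, D. & Hatfield, G.W. (1995). Codon pair utilization biases influence translational elongation step times. J Biol Chem 270:22801-22806.
Karlin et al. (2001). Characterization of highly expressed genes of four fast-growing bacteria. J. of Bacteriology 183(17):5025-39.
Kunst, F. et al. (1997). The complete genome sequence of the Gram-positive bacterium Bacillus subtilis. Nature 390:249-256.
Lithwick, G. & Margalit, H. (2003). Hierarchy of sequence-dependent features associated with prokaryotic translation. Genome Res. 13(12):2665-73.
Makrides, S.C. (1996). Strategies for achieving high-level expression of genes in Escherichia coli. Microbiol. Rev. 60:512-538.
Moura, G. et al. (2005). Comparative context analysis of codon pairs on an ORFeome scale. Genome Biology 2005, 6:R28.
Nevalainen, K.M.H., Te'o, V.S.J. & Bergquist, P.L. (2005). Heterologous protein expression in filamentous fungi. Trends Biotechnol. 2005 23(9):468-474.
Pel, H.J. et al. (2007). Genome sequencing and analysis of the versatile cell factory Aspergillus niger CBS 513.88. Nat Biotech. 2007 21(2):221-231.
Punt, P.J., van Biezen, N., Conesa, A., Albers, A., Mangnus, J. & van den Hondel, C. (2005). Filamentous fungi as cell factories for heterologous protein production. Trends Biotechnol. 20(5):200-206.
Rocha, E.P.C., Danchin, A. & Viari, A. (1999). Translation in Bacillus subtilis: roles and trends of initiation and termination, insights from a genome analysis. NAR 27(17):3567-76.
Schwartz, S. & Curran, J.F. (1997). Analyses of frameshifting at UUU-pyrimidine sites. NAR 25(10):2005-2011.
Spanjaard, R.A. & van Duin, J. (1988). Translation of the sequence AGG-AGG yields 50% ribosomal frameshift. PNAS 85:7967-7971.

Appendix 1: List of symbols and equations

Single codons

$\mathrm{syn}(c_k)$: the codons coding for the same amino acid as $c_k$
$n^{sc}(c_k)$: number of occurrences of codon $c_k$
$r^{sc}(c_k)$: ratio of codon $c_k$ (compared to its synonyms)

Codon pairs

$(c_i, c_j)$: codon pair
$n_{obs}((c_i,c_j))$: occurrence of the codon pair (observed number)

Expected number of the codon pair:
$$n_{exp}((c_i,c_j)) = r^{sc}(c_i)\, r^{sc}(c_j) \sum_{c_m \in \mathrm{syn}(c_i),\; c_n \in \mathrm{syn}(c_j)} n_{obs}((c_m,c_n))$$

Corresponding standard deviation:
$$\sigma((c_i,c_j)) = \sqrt{n_{exp}((c_i,c_j))\,\bigl(1 - r^{sc}(c_i)\, r^{sc}(c_j)\bigr)}$$

Corresponding standard score (z-score):
$$z((c_i,c_j)) = \frac{n_{obs}((c_i,c_j)) - n_{exp}((c_i,c_j))}{\sigma((c_i,c_j))}$$

Bias coefficient of the codon pair:
$$bias((c_i,c_j)) = \frac{n_{obs}((c_i,c_j)) - n_{exp}((c_i,c_j))}{\max\bigl(n_{obs}((c_i,c_j)),\, n_{exp}((c_i,c_j))\bigr)}$$

Combined "expected" value (for the weights):
$$n^{comb}_{exp}((c_i,c_j)) = r^{sc}_{high}(c_i)\, r^{sc}_{high}(c_j) \sum_{c_m \in \mathrm{syn}(c_i),\; c_n \in \mathrm{syn}(c_j)} n^{all}_{obs}((c_m,c_n))$$

Codon pair weights, method using one sequence group (or genome):
$$w((c_i,c_j)) = \frac{n_{exp}((c_i,c_j)) - n_{obs}((c_i,c_j))}{\max\bigl(n_{exp}((c_i,c_j)),\, n_{obs}((c_i,c_j))\bigr)}$$

Codon pair weights, method using the highly expressed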
(dá)的組與參考組(或基因組)c)卜o(c,,。))-"0r((c,,。.))85附錄2:CR載體表B.l:按列表示的以下生物的CR矩陣值(1)AN:Am'g^全基因組-方法統(tǒng)計分布;(2)ANS:Am'g^250個高表達(dá)基因-方法目檢,(3)AN—d:Am'ge/"關(guān)注-不關(guān)注(O-l)載體;(4)BS:及w6rifc全基因組-方法統(tǒng)計分布;(5)BSS:及50個高表達(dá)基因-方法,目檢,(6)BS—d:及關(guān)注-不關(guān)注(O-l)載體;(7)EC:五.co//全基因組4298seq;-方法統(tǒng)計分布;(8)ECS五.co//來自Carboneetal.(2003)的高表達(dá)組100seq-方法目檢;(9)EC—d:五.co"關(guān)注-不關(guān)注(0-1)載體;(10)BA:及"mj/o/z々we/adms全基因組-方法統(tǒng)計分布;(11)BAS:及flm少o〃z々t/e/fl"'eAw50個高表達(dá)的基因-方法目檢;(12)BS—d:及"w少o〃/《Me/ac/e"j關(guān)注-不關(guān)注(0-l)載體;(13)SC:&cereWWae全基因組-方法統(tǒng)計分布;(14)SCS:cewv&'ae200個高表達(dá)的基因-方法目檢,(15)SC—d:5".cem^z'ae關(guān)注-不關(guān)注(0-l)載體;(16)SCO:&coe//co/oM3(2)全基因組-方法統(tǒng)計分布。注意對于真菌微生物(更特定的P.c/>sogemw7、AOryzae、A^re"51、X.w.c/w/"ra、爿.^m/ga加、Zreesez'、iV",/^c/en'0而言,使用J.m'gw序列衍生的CR載體適用。對于一般酵母(更特定的^cfc和;cw26e)而言,使用cem^'ae序列衍生的CR載體適用。對于^r印tomyces物種而言,使用&coe//co/orA3(2)的CR載體適用。1ANANSAN_d4BSBSS6BS一dEC8ECS9EC—d10BA11BAS12BA—d1AAA3301711001758106910012AAC58100143550731000665003AAG67100129012519031014AAT4201574506600765005ACA2101414601400293006ACC3570116014257121017ACG2201262212600384008ACT233011733018431133009AGA1301272805001420010AGC1821122320272812230011AGG1201100130070112AGT13011101160090113ATA14011401900150114ATC527313540041721424008615ATG100100110010011001000100100116ATT3427151600512814460017CAA4001531001361704260018CAC51100132290428104140019CAG6010014701648305840020CAT490168710581905960021CCA22012022020140110122CCC2964110011300170123CCG240141430498605260024CCT253612935017002040025CGA15011001700120126CGC2551119340383403240027CGG180115011000200128CGT1649118380366601540029CTA100150140040130CTC2438111011000170131CTG2532123014710002820032CTT171712337011002130033GAA42261691001698006640034GAC4964136371376414440035GAG587413101312003460036GAT5136164631633615660037GCA210129500223001830038GCC3251120012700260139GCG211112501342303830040GCT2638126500174701840041GGA241613235112002630042GGC3235133341394214140043GGG190116011500170144GGT2549119311345811630045GTA11012123116290162504
6GTC35541240121003225047GTG3019126301361902825048GTT2427129471265302325049TAA27100162100159100041100150TAC58100134380427615950051TAG310122013200210152TAT420166620582411650053TCA13012434013001630054TCC23441120114311160155TCG17141100114006140056TCT17211213401541110030057TGA420116019003920058TGC591001544905510001730059TGG100100110010011001000410160TGT4101465104500140161TTA601213901400590162TTC65100130450417716940063TTG1813116240130066100164TTT3501705505923131600ANANSAN—dBABASBA_dECECSEC_dBSBSSBS一d12346"789101112表B.l繼續(xù)13141516171819202122232487SCSCSSC—dSCO1AAA5921152AAC40751%3AAG41791954AAT6025145ACA316126ACC21401657ACG1421318ACT3452129AGA47761110AGC11312511AGG2131412AGT1641313ATA2801214ATC264819615ATG100100110016ATT46581217CAA69901518CAC365919319CAG311019520CAT64411721CCA41741222CCC16514123CCG13015424CCT31231225CGA701326CGC6114727CGG4013928CGT14251629CTA1491030CTC6013631CTG11516032CTT1331233GAA708511534GAC355119535GAG301518536GAT65491537GCA3021438GCC223315839GCG11013640GCT37641241GGA2301742GGC20816443GGG12111944GGT459511045GTA2201346GTC203915547GTG19814048GTT39541249TAA-1001-50TAC437419551TAG-01-52TAT57261553TCA2181254TCC16321418855TCG10512856TCT26481157TGA-01-58TGC381319159TGG100100110060TGT62871961TTA28211062TTC417119863TTG28621264TTT592912SCSCS1SCO131415161718192021222324附錄3:CPW矩陣表C丄CPW矩陣Azergz7^m'ger全基因組(左側(cè)密碼子在第2列中指出,右側(cè)密碼子在第2行中指出)。宿主細(xì)胞A序列數(shù)據(jù)全Am'ger基因組。123456789101112AAAAACAAGAATACAACCACGACTAGAAGCAGGAGT1AAA'0.620.370.380.34-0.160.070.180.29-0.19-0.03-0.140.112AAC-0.17-0.28-0.23-0.050.08-0.25-0.010.030.02-0.300.17-0.043AAG-0.02-0.20-0.24-0.05-0.02-0.04-0.120.03-0.14-0.05-0.18-0.034AAT0.240.450.440.220.160.160.150.060.19o.n0.230.085ACA0.000.090.250.04-0.250.040.140.20-0.25-0.01-0.220.106ACC-0.17-0.35-0.28-0.070.11-0.300.170.01-0.08-0.220.07-0.027ACG-0.180.01-0.06-0.050.000.21-0.140.10-0.250.15-0.360.008ACT0.360.570.530.400.130.170.17-0.220.030.270.150.209AGA-0.010.050.06-0.08-0.180.190.100.11-0.37-0.17-0.30-0.1510AGC-0.28-0.26-0.26-0.250.00-0.0
60.040.03-0.20-0.31-0.04-0.2111AGG-0.310.08-0.22-0.32-0.170.31-0.250.01-0.390.10-0.52-0.1812AGT0.210.350.470.170.240.370.280.130.210.280.270.0513ATA0.060.250.380.16-0.210.030.080.04-0.100.07-0.070.1714ATC-0.27-0.35-0.31-0.150.03-0.30-0.03-0.040.14-0.100.200.0215ATG0.02-0.06-0.010.090.050.01-0.100.04-0.13-0.14-0.080.0016ATT0.500.550.560.450.340.260.260.160.410.440.460.3817CAA0.270.210.250.10-0.15-0.010.140.20-0.07-0.090.050.0318CAC-0.29-0.25-0.26-0.18-0.05-0.220.01-0.030.11-0.250.32-0.0919CAG-0.17-0.08-0.13-0.11-0.080.09-0.09-0.06-0.08-0.010.02-0.0120CAT0.230.440.460.090.080.180.150.040.420.290.500.2221CCA0.100.160.240.01-0.240.010.120.02-0.120.13-0.110.1722CCC-0.28-0.36-0.37-0.150.09-0.150.03-0.04-0.12-0.170.05-0.0623CCG-0.140.070.06-0.09-0.120.08-0.140.00-0.040.32-0.060.1424CCT0.380.460.480.260.090.180.16-0.020.310.380.330.2325CGA0.170.190.280.12-0.190.060.160.17-0.03-0.140.010.0326CGC-0.24-0.25-0.26-0.22-0.07-0.16-0.160.030.01-0.320.18-0.2727CGG-0.220.130.01-0.15-0.260.15-0.31-0.11-0.240.07-0.19-0.1828CGT0.510.450.670.480.290.240.320.170.630.440.580.4029CTA0.240.260.430.25-0.030.030.330.270.240.120.180.2530CTC-0.23-0.30-0.20-0.110.03-0.240.09-0.020.29-0.170.33-0.0631CTG-0.16-0.12-0.130.040.120.070.020.090.120.010.040.0932CTT0.540.520.640.480.190.230.260.110.560.440.550.3833GAA0.460.270.240.09-0.090.060.090.17-0.23-0.16-0.14-0.1234GAC-0.18-0.21-0.28-0.190.01-0.13-0.030.000.01-0.310.05-0.2335GAG-0.07-0.07-0.23-0.19-0.050.10-0.22-0.03-0.19-0.09-0.31-0.2036GAT0.240.340.360.090.060.120.02-0.010.200.190.270.0137GCA0.080.090.150.00-0.160.070.050.04-0.190.07-0.230.0638GCC-0.28-0.36-0.28-0.230.10-0.11-0.04-0.05-0.05-0.25-0.01-0.2639GCG-0.070.11-0.06-0.08-0.160.15-0.260.00-0.180.28-0.310.0540GCT0.380.600.480.380.160.210.10-0.120.180.310.220.1341GGA0.03-0.09-0.11-0.19-0.210.07-0.09-0.05-0.37-0.20-0.30-0.2842GGC-0.12-0.13-0.19-0.04-0.10-0.12-0.11-0.040.10-0.250.15-0.2743GGG-0.330.21-0.17-0.29-0.150.34-0.26-0.02-0.310.20-0.52-0.2544GGT0.340.310.650.320.240.170.380.030.310.210.
45-0.0245GTA0.220.300.380.09-0.130.090.080.120.180.380.070.3946GTC-0.20-0.34-0.33-0.240.14-0.180.00-0.180.18-0.190.15-0.1147GTG-0.010.03-0.060.06-0.010.09-0.22-0.120.030.16-0.270.2448GTT0.530.550.500.380.350.310.13-0.030.430.370.240.2849TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC-0.18-0.29-0.24-0.060.13-0.230.000.010.15-0.200.34-0.0651TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.330.440.410.240.180.140.050.050.320.330.310.2153TCA0.010.220.29-0.03-0.330.020.05-0.09-0.180.11-0.270.1454TCC-0.21-0.30-0.30-0.070.07-0.210.100.00-0.01-0.190.080.1355TCG-0.140.06-0.02-0.11-0.090.00-0.20-0.09-0.070.21-0.210.1056TCT0.440.550.560.370.030.150.17-0.130.220.350.300.3757TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC-0.26-0.19-0.25-0.23-0.10-0.10-0.15-0.07-0.09-0.10-0.04-0.2659TGG-0.090.020.05-0.03-0.030.15-0.15-0.04-0.28-0.01-0.29-0.2260TGT0.270.430.610.310.160.250.140.080.390.470.370.3261TTA0.310.390.410.20-0.050.130.110.090.040.24-0.010.1362TTC-0.18-0.30-0.27-0.080.14-0.240.05-0.060.19-0.090.200.0263TTG-0.17-0.12-0.26-0.24-0.17-0.08-0.34-0.29-0.080.08-0.35-0.1264TTT0.540.580.610.460.250.190.090.080.400.410.290.29AAAAACAAGAATACAACCACGACTAGAAGCAGGAGT146"789101112表C.l繼續(xù)131415161718192021222324ATAATCATGATTCAACACCAGCATCCAcccCCGCCT1AAA-0.140.210.190.31-0.14-0.070.16-0.08-0.11-0.040.050.192AAC0.24-0.17-0.090.08-0.02-0.18-0.110.100.05-0.19-0.050.083AAG-0.04-0.13-0.08-0.010.060.00-0.060.080.03-0.090.000.054AAT0.060.080.140.040.060.160.150.020.070.050.100.05ACA-0.110.200.040.20-0.070.010.050.00-0.22-0.020.000.056ACC0.11-0.29-0.14-0.070.09-0.120.080.200.190.020.110.217ACG-0.160.12-0.050.09-0.06-0.02-0.19-0.07-0.030.10-0.150.048ACT0.120.180.260.09-0.01-0.010.04-0.02-0.02-0.16-0.07-0.199AGA-0.310.190.060.09-0.14-0.05-0.05-0.15-0.100.150.100.1410AGC-0.08-0.09-0.20-0.090.00-0.05-0.09-0.120.130.220.130.1611AGG-0.440.19-0.09-0.110.050.200.100.02-0.150.27-0.050.0112AGT-0.200.030.11-0.02-0.040.110.07-0.120.040.07-0.050.0213ATA-0.340.170.100.21-0.140.110.07-0.
07-0.20-0.08-0.02-0.0914ATC0.21-0.25-0.190.000.04-0.18-0.020.170.16-0.090.210.1915ATG-0.090.010.000.030.04-0.04-0.030.04-0.010.01-0.020.0216ATT0.240.190.320.16-0.030.070.03-0.010.03-0.19-0.03-0.1617CAA-0.210.140.150.25-0.160.020.22-0.03-0.150.020.100.1418CAC0.21-0.16-0.13-0.02-0.14-0.180.000.140.07-0.010.060.1519CAG-0.09-0.07-0.09-0.070.010.04-0.06-0.04-0.110.040.00-0.0120CAT-0.040.160.150.01-0.060.140.15-0.05-0.07-0.04-0.03-0.1121CCA-0.020.240.090.08-0.050.110.210.08-0.300.08-0.030.1022CCC-0.06-0.29-0.20-0.120.060.160.020.150.190.560.060.2923CCG-0.140.09-0.040.05-0.080.02-0.07-0.10-0.080.18-0.190.0124CCT0.070.200.260.08-0.12-0.170.01-0.19-0.23-0.17-0.20-0.2525CGA0.070.280.180.25-0.15-0.11-0.08-0.04-0.270.05-0.23-0.0326CGC-0.01-0.23-0.20-0.140.070.07-0.04-0.070.230.200.230.3127CGG-0.330.18-0.06-0.030.070.300.170.05-0.230.12-0.14-0.0828CGT0.110.010.310.19-0.08-0.100.02-0.09-0.13-0.22-0.20-0.2029CTA0.030.250.240.32-0.19-0.15-0.04-0.10-0.26-0.08-0.05-0.0630CTC0.06-0.22-0.08-0.030.06-0.030.170.230.14-0.120.310.1331CTG0.050.00-0.020.060.02-0.06-0.120.050.150.090.150.0932CTT0.210.250.390.20-0.12-0.130.04-0.18-0.23-0.34-0.04-0.3333GAA-0.070.110.130.14-0.050.070.170.000.000.090.100.1634GAC0.100.00-0.10-0.03-0.04-0.15-0.12-0.030.16-0.12-0.02-0.0135GAG-0.05-0.01-0.09-0.140.010.03-0.09-0.07-0.090.02-0.09-0.0936GAT-0.010.060.11-0.080.100.220.09-0.020.10-0.02-0.040.0237GCA0.090.300.180.170.040.100.07-0.02-0.180.10-0.10-0.0238GCC0.13-0.27-0.24-0.280.210.000.040.140.280.200.150.2039GCG0.010.230.020.08-0.100.02-0.24-0.23-0.090.06-0.28-0.0740GCT0.120.180.24-0.070.040.04-0.02-0.060.04-0.13-0.01-0.2641GGA-0.110.210.020.070.000.17-0.01-0.01-0.080.16-0.050.0442GGC0.12-0.04-0.14-0.10-0.05-0.15-0.14-0.120.160.070.090.0043GGG-0.300.26-0.08-0.150.160.330.06-0.04-0.150.01-0.20-0.1844GGT0.15-0.130.26-0.030.040.060.11-0.020.11-0.020.06-0.1545GTA-0.040.300.220.20-0.17-0.02-0.04-0.16-0.29-0.04-0.11-0.2146GTC0.13-0.22-0.18-0.230.23-0.020.100.190.330.110.270.1547GTG0.020.19-0.01-0.060.010.03-0.21-
0.080.140.10-0.15-0.0748GTT0.140.220.250.000.010.000.04-0.08-0.05-0.21-0.02-0.3249TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.18-0.17-0.110.040.02-0.15-0.050.130.17-0.020.170.1551TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.080.130.180.05-0.01o.n0.06-0.04-0.06-0.15-0.15-0.1553TCA-0.240.200.040.070.000.180.140.12-0.280.000.00-0.0254TCC-0.01-0.25-0.110.030.13-0.020.080.160.120.120.090.1855TCG-0.140.120.000.15-0.120.05-0.11-0.12-0.040.04-0.14-0.0956TCT0.080.210.330.21-0.08-0.010.02-0.09-0.15-0.21-0.13-0.2757TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC-0.05-0.01-0.14-0.170.04-0.04-0.06-0.050.260.170.060.1559TGG-0.090.090.00-0.090.050.00-0.030.00-0.020.06-0.070.0260TGT-0.020.160.230.12-0.060.150.09-0.01-0.13-0.11-0.27-0.2161TTA-0.250.270.090.18-0.18-0.010.00-0.20-0.32-0.01-0.12-0.1562TTC0.30-0.21-0.13-0.020.04-0.20-0.140.120.20-0.110.11-0.0263TTG-0.23-0.12-0.28-0.260.150.27-0.020.050.210.270.090.0864TTT0.280.210.290.130.100.170.190.070.00-0.09-0.01-0.07ATAATCATGATTCAACACCAGCATCCACCCCCGCCT131415161718192021222324表l繼續(xù)252627282930313233343536CGACGCCGGCGTCTACTCCTGCTTGAAGACGAGGAT1AAA-0.140.130.000.25-0.120.050.150.18-0.15-0.180.07-0.042AAC0.02-0.18-0.06-0.110.11-0.20-0.080.050.110.020.090.293AAG0.130.010.080.070.04-0.06-0.10-0.040.100.09-0.040.04<table>tableseeoriginaldocumentpage92</column></row><table>59TGG0.150.080.170.17-0.140.09-0.05-0.070.010.12-0.01-0.1060TGT-0.23-0.16-0.29-0.09-0.21-0.16-0.19-0.23-0.110.01-0.04-0.1961TTA-0.230.25-0.140.19-0.33-0.01-0.06-0.200.040.130.160.1162TTC0.01-0.22-0.08-0.200.18-0.21-0.010.010.140.040.110.2463TTG0.120.200.090.270.160.250.060.04-0.030.14-0.14-0.1464TTT-0.070.110.090.180.010.000.110.01-0.13-0.20-0.22-0.21CGACGCCGGCGTCTACTCCTGCTTGAAGACGAGGAT252627282930313233343536表C.l繼續(xù)373839404142434445464748GCAGCCGCGGCTGGAGGCGGGGGTGTAGTCGTGGTT1AAA-0.27-0.15-0.040.06-0.15-0.15-0.130.21-0.26-0.120.040.182AAC0.310.040.410.390.260.010.310.270.420.090.300.323AAG0.160.050.000.040.090.010.10-0.030.22-0.030
.02-0.024AAT-0.23-0.26-0.35-0.23-0.23-0.26-0.29-0.02-0.13-0.30-0.23-0.255ACA-0.29-0.070.010.06-0.020.06-0.010.34-0.240.100.060.186ACC0.280.160.350.380.250.000.310.110.35-0.070.210.187ACG-0.080.01-0.30-0.010.020.01-0.210.07-0.090.06-0.250.038ACT-0.19-0.17-0.15-0.27-0.23-0.20-0.19-0.28-0.12-0.14-0.02-0.279AGA-0.160.150.010.04-0.220.09-0.190.08-0.350.190.080.1410AGC0.180.150.310.270.160.080.210.160.270.130.170.2011AGG0.000.32-0.08-0.03-0.020.28-0.250.08-0.210.25-0.13-0.1312AGT-0.09-0.05-0.23-0.17-0.29-0.17-0.32-0.30-0.31-0.16-0.31-0.3013ATA-0.160.140.140.130.160.280.120.37-0.280.220.190.2214ATC0.280.040.330.350.170.060.250.150.430.040.240.3515ATG0.020.02-0.04-0.010.01-0.050.050.02-0.040.010.000.0016ATT-0.23-0.30-0.30-0.28-0.28-0.23-0.28-0.20-0.16-0.32-0.26-0.3017CAA-0.23-0.100.04-0.02-0.18-0.20-0.160.11-0.28-0.110.060.0818CAC0.300.030.330.270.21-0.030.320.150.440.020.480.2119CAG0.070.140.000.020.080.080.250.000.060.030.030.0320CAT-0.22-0.11-0.25-0.16-0.12-0.16-0.16-0.04-0.20-0.23-0.10-0.2821CCA-0.250.010.11-0.07-0.110.02-0.030.20-0.270.050.040.1222CCC0.250.030.280.200.160.040.20-0.020.25-0.090.180.1623CCG0.030.13-0.13-0.030.000.16-0.120.10-0.070.17-0.13-0.0524CCT-0.160.01-0.08-0.28-0.17-0.07-0.11-0.22-0.09-0.060.02-0.2525CGA-0.230.08-0.070.08-0.12-0.05-0.010.23-0.250.030.010.1626CGC0.14-0.060.240.310.16-0.010.210.130.230.110.110.3127CGG0.020.17-0.24-0.060.060.250.180.14-0.230.19-0.15-0.0228CGT-0.05-0.23-0.15-0.26-0.22-0.19-0.01-0.44-0.15-0.28-0.13-0.2729CTA-0.26-0.080.110.00-0.19-0.130.060.19-0.25-0.050.180.1430CTC0.31-0.020.400.290.240.100.350.160.33-0.080.320.2431CTG-0.06-0.06-0.13-0.09-0.15-0.05-0.02-0.12-0.04-0.03-0.09-0.0132CTT-0.14-0.10-0.13-0.22-0.230.01-0.18-0.23-0.01-0.180.03-0.2733GAA-0.21-0.14-0.08-0.06-0.17-0.13-0.220.06-0.26-0.09-0.010.0134GAC0.320.170.380.390.290.130.210.260.360.280.250.3035GAG0.140.19-0.040.060.160.17-0.020.040.040.18-0.03-0.0236GAT-0.16-0.17-0.34-0.23-0.18-0.15-0.31-0.07-0.19-0.13-0.25-0.2937GCA-0.300.05-0.02-0.02-0.110.010.000.17-0.210
.19 0.02 0.13
38 GCC 0.30 0.10 0.34 0.28 0.27 0.17 0.27 0.00 0.23 0.01 0.08 0.17
39 GCG 0.13 0.23 -0.23 0.14 0.15 0.14 -0.08 0.07 -0.04 0.26 -0.14 0.09
40 GCT -0.12 -0.13 -0.18 -0.40 -0.18 -0.11 -0.22 -0.37 -0.11 -0.15 -0.15 -0.30
41 GGA -0.06 0.23 0.10 0.08 -0.12 0.11 -0.09 0.09 -0.11 0.26 0.07 0.17
42 GGC 0.12 0.09 0.24 0.18 0.25 0.05 0.27 0.14 0.25 0.20 0.15 0.26
43 GGG 0.02 0.33 -0.12 0.01 0.28 0.42 0.28 0.21 -0.23 0.37 -0.20 -0.03
44 GGT -0.14 -0.27 -0.09 -0.40 -0.22 -0.18 -0.07 -0.54 -0.20 -0.32 -0.18 -0.39
45 GTA -0.25 -0.04 -0.01 0.00 -0.09 0.16 -0.08 0.37 -0.31 0.13 0.14 0.23
46 GTC 0.25 0.10 0.31 0.22 0.17 0.00 0.18 0.04 0.39 0.09 0.25 0.27
47 GTG 0.03 0.12 -0.08 -0.09 0.04 0.15 0.03 0.16 -0.09 0.15 -0.28 -0.09
48 GTT -0.15 -0.10 -0.25 -0.32 -0.22 -0.15 -0.31 -0.30 -0.01 -0.17 -0.14 -0.34
49 TAA 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50 TAC 0.35 0.03 0.34 0.33 0.17 -0.07 0.28 0.20 0.42 0.13 0.19 0.30
51 TAG 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
52 TAT -0.17 -0.24 -0.38 -0.19 -0.17 -0.17 -0.18 -0.03 -0.11 -0.23 -0.26 -0.25
53 TCA -0.24 0.02 0.14 0.03 0.00 0.17 -0.07 0.31 -0.27 0.09 0.12 0.18
54 TCC 0.17 -0.08 0.32 0.23 0.21 0.00 0.22 0.10 0.28 -0.07 0.19 0.21
55 TCG 0.07 0.09 -0.08 0.03 0.07 0.12 -0.06 0.23 -0.12 0.09 -0.11 0.06
56 TCT -0.26 -0.18 -0.19 -0.35 -0.24 -0.14 -0.18 -0.20 -0.10 -0.08 -0.05 -0.20
57 TGA 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
58 TGC 0.18 0.04 0.24 0.20 0.25 0.18 0.14 -0.03 0.19 0.21 0.17 0.29
59 TGG 0.02 0.11 -0.12 -0.04 -0.07 0.07 0.18 -0.14 -0.20 0.24 -0.12 -0.05
60 TGT -0.06 -0.15 -0.29 -0.22 -0.17 -0.03 -0.21 -0.26 -0.22 -0.21 -0.21 -0.31
61 TTA -0.11 0.15 0.13 0.03 0.00 0.17 -0.15 0.19 -0.28 0.20 0.11 0.07
62 TTC 0.40 0.12 0.38 0.35 0.30 0.21 0.32 0.16 0.45 0.06 0.30 0.29
63 TTG 0.09 0.17 -0.15 -0.05 -0.01 0.21 -0.16 0.08 -0.01 0.13 -0.14 -0.12
64 TTT -0.29 -0.33 -0.42 -0.36 -0.29 -0.28 -0.40 -0.27 -0.15 -0.30 -0.33 -0.31
Right codon: GCA GCC GCG GCT GGA GGC GGG GGT GTA GTC GTG GTT
Column no.: 37 38 39 40 41 42 43 44 45 46 47 48

Table C.1 (continued)
Column no.: 49 50 51 52 53 54 55 56 57 58 59 60
Right codon: TAA TAC TAG TAT TCA TCC TCG TCT TGA TGC TGG TGT
1 AAA 0.00 -0.25 0.00 -0.18 -0.32 -0.22 -0.04 -0.04 0.00 -0.37 -0.34 -0.29
2 AAC 0.00 -0.25 0.00 0.06 0.15 -0.09 -0.11 0.12 0.00 -0.19 -0.12 0.16
3 AAG 0.00 0.14 0.00 0.14 0.11 0.13 0.04 0.13 0.00 0.27 0.24 0.22
4 AAT 0.00 0.30 0.00 0.14 0.04 0.16 0.14 0.10 0.00 0.09 0.19 0.11
5 ACA 0.00 -0.14 0.00 -0.21 -0.25 -0.12 -0.06 -0.18 0.00 -0.14 -0.22 -0.14
6 ACC 0.00 -0.27 0.00 0.17 0.17 -0.22 -0.01 0.12 0.00 -0.23 -0.11 0.12
7 ACG 0.00 0.23 0.00 0.06 0.19 0.36 0.09 0.19 0.00 0.33 0.31 0.34
8 ACT 0.00 0.31 0.00 0.21 -0.07 0.02 -0.04 -0.07 0.00 0.03 0.17 -0.01
9 AGA 0.00 -0.05 0.00 -0.25 -0.05 0.25 0.15 0.09 0.00 -0.18 -0.18 -0.17
10 AGC 0.00 -0.23 0.00 -0.13 0.10 0.08 0.10 0.09 0.00 -0.21 -0.21 -0.14
11 AGG 0.00 0.44 0.00 -0.02 -0.05 0.43 0.12 0.19 0.00 0.39 0.21 0.05
12 AGT 0.00 0.30 0.00 0.02 0.21 0.44 0.38 0.34 0.00 0.15 0.21 0.16
13 ATA 0.00 -0.20 0.00 -0.31 -0.51 -0.32 -0.27 -0.37 0.00 -0.21 -0.26 -0.29
14 ATC 0.00 -0.27 0.00 0.12 0.17 -0.23 0.03 0.10 0.00 -0.10 -0.10 0.10
15 ATG 0.00 -0.01 0.00 0.02 0.08 0.07 0.02 0.00 0.00 -0.01 0.00 0.02
16 ATT 0.00 0.41 0.00 0.38 0.13 0.07 0.20 0.13 0.00 0.20 0.31 0.13
17 CAA 0.00 -0.07 0.00 -0.13 -0.19 -0.10 0.07 0.06 0.00 -0.22 -0.20 -0.24
18 CAC 0.00 -0.21 0.00 -0.02 0.11 -0.15 -0.06 0.05 0.00 -0.21 -0.15 0.08
19 CAG 0.00 0.13 0.00 0.00 -0.08 0.14 0.02 0.06 0.00 0.22 0.16 0.16
20 CAT 0.00 0.29 0.00 0.00 -0.07 0.09 0.02 0.04 0.00 0.17 0.19 0.06
21 CCA 0.00 -0.07 0.00 -0.09 -0.38 -0.15 -0.13 -0.17 0.00 -0.11 -0.18 -0.17
22 CCC 0.00 -0.17 0.00 0.23 0.15 -0.16 0.13 0.04 0.00 -0.08 -0.02 0.10
23 CCG 0.00 0.13 0.00 -0.16 -0.06 0.19 -0.08 0.01 0.00 0.26 0.06 0.21
24 CCT 0.00 0.11 0.00 0.09 -0.11 0.02 0.05 -0.12 0.00 -0.06 0.16 -0.09
25 CGA 0.00 -0.06 0.00 -0.16 -0.33 0.07 -0.17 0.03 0.00 -0.16 -0.16 -0.11
26 CGC 0.00 -0.16 0.00 0.00 0.04 -0.11 0.05 0.06 0.00 -0.14 -0.09 0.07
27 CGG 0.00 0.12 0.00 -0.20 -0.15 0.18 -0.24 -0.01 0.00 0.20 0.09 0.08
28 CGT 0.00 0.32 0.00 0.19 0.19 0.16 0.25 0.09 0.00 0.09 0.26 0.04

[The table on page 95 of the original document is not reproduced in this text; see the original document.]

14 ATC 0.17 -0.21 0.18 0.14
15 ATG 0.22 0.02 0.15 -0.03
16 ATT 0.23 0.27 0.19 0.46
17 CAA -0.36 -0.11 -0.03 -0.07
18 CAC 0.26 -0.11 0.11 -0.10
19 CAG 0.12 0.06 0.14 0.09
20 CAT -0.04 0.09 0.04 0.18
21 CCA -0.32 0.01 -0.20 -0.17
22 CCC 0.04 -0.12 -0.04 0.09
23 CCG -0.17 0.12 -0.22 -0.09
24 CCT -0.17 0.06 -0.11 0.13
25 CGA -0.24 -0.06 -0.02 -0.07
26 CGC 0.03 -0.19 0.16 0.03
27 CGG -0.27 -0.13 -0.19 -0.20
28 CGT -0.05 0.14 0.24 0.40
29 CTA -0.28 -0.04 0.13 -0.14
30 CTC 0.20 -0.29 0.29 -0.01
31 CTG 0.34 0.15 0.23 0.17
32 CTT 0.00 -0.01 0.07 0.23
33 GAA -0.19 0.00 -0.08 -0.08
34 GAC 0.29 -0.10 -0.08 -0.33
35 GAG 0.23 0.08 -0.03 -0.09
36 GAT 0.10 0.24 0.03 0.25
37 GCA -0.26 0.02 -0.21 -0.20
38 GCC 0.13 -0.04 -0.08 -0.16
39 GCG 0.12 0.24 -0.23 -0.11
40 GCT -0.17 0.08 -0.15 0.10
41 GGA -0.26 0.03 -0.13 -0.15
42 GGC 0.00 -0.10 -0.09 -0.04
43 GGG -0.33 0.00 -0.37 -0.36
44 GGT 0.12 0.23 0.28 0.43
45 GTA -0.38 -0.13 -0.13 -0.25
46 GTC 0.23 -0.19 0.04 -0.11
47 GTG 0.26 0.28 -0.10 -0.17
48 GTT 0.22 0.20 0.06 0.34
49 TAA 0.00 0.00 0.00 0.00
50 TAC 0.25 -0.15 0.21 -0.11
51 TAG 0.00 0.00 0.00 0.00
52 TAT 0.18 0.21 0.15 0.25
53 TCA -0.39 -0.08 -0.13 -0.14
54 TCC 0.05 -0.12 0.08 0.06
55 TCG -0.01 0.14 -0.02 0.03
56 TCT -0.13 0.02 -0.05 0.10
57 TGA 0.00 0.00 0.00 0.00
58 TGC -0.10 -0.07 -0.13 -0.17
59 TGG 0.04 0.04 0.10 -0.06
60 TGT -0.09 0.19 0.01 0.15
61 TTA -0.39 -0.03 -0.14 -0.21
62 TTC 0.19 -0.24 0.00 -0.01
63 TTG 0.17 0.23 -0.05 0.03
64 TTT 0.28 0.29 0.15 0.56
Right codon: TTA TTC TTG TTT
Column no.: 61 62 63 64

Table C.2: CPW matrix for A. niger highly expressed sequences (the left codon is indicated in column 2, the right codon in row 2). Host cell: A. niger; sequence data: full A. niger genome; highly expressed group: 400 sequences.
Column no.: 1 2 3 4 5 6 7 8 9 10 11 12
Right codon: AAA AAC AAG AAT ACA ACC ACG ACT AGA AGC AGG AGT
1 AAA 0.93 0.64 0.65 0.92 0.61 0.65 0.67 0.57 0.07 0.61 -0.09 0.75
2 AAC 0.44 -0.48 -0.42 0.49 0.57 -0.50 0.16 -0.33 0.33 -0.25 0.60 0.13
3 AAG 0.51 -0.42 -0.36 0.48 0.27 -0.46 0.43 -0.18 -0.23 -0.14 0.13 0.23
4 AAT 0.60 0.68 0.75 0.65 0.35 0.63 0.44 0.80 0.75 0.36 0.80 0.31
5 ACA 0.29 0.45 0.56 0.71 0.22 0.43 0.15 0.66 0.33 0.58 -0.14 0.28
6 ACC 0.34 -0.58 -0.53 0.45 0.33 -0.59 0.43 -0.46 0.31 -0.41 -0.20 0.15
7 ACG 0.25 -0.02 0.39 0.16 0.56 0.08 0.52 0.47 0.01 0.59 -0.30 0.57
8 ACT 0.50 0.81 0.48 0.48 0.66 0.10 0.60 0.20 -0.36 0.26 0.64 0.62
9 AGA 0.64 -0.07 0.01 0.27 0.56 0.27 0.63 0.29 -0.32 0.34 0.09 0.27
10 AGC 0.34 -0.39 -0.44 0.08 0.40 -0.34 0.33 -0.38 0.42 -0.39 -0.30 0.20
11 AGG -0.20 0.24 -0.29 0.17 -0.44 -0.33 -0.44 0.17 -0.04 0.58 0.72 0.33
12 AGT 0.78 0.56 0.79 0.51 0.28 0.58 0.54 0.45 0.49 0.71 0.89 0.75
13 ATA 0.02 0.86 0.62 0.58 0.44 0.61 0.53 0.74 0.41 0.68 0.69 0.50
14 ATC 0.12 -0.53 -0.44 0.29 0.52 -0.58 0.12 -0.03 0.38 -0.14 0.04 0.21
15 ATG 0.41 -0.21 -0.13 0.44 0.56 -0.25 0.28 -0.16 -0.08 -0.36 0.31 0.26
16 ATT 0.63 0.77 0.82 0.82 0.66 0.38 0.61 0.47 0.76 0.61 0.75 0.58
17 CAA 0.70 0.52 0.58 0.01 -0.25 0.44 0.03 0.57 -0.45 0.19 -0.06 0.53
18 CAC -0.13 -0.47 -0.47 -0.04 0.41 -0.55 -0.29 -0.05 0.02 -0.28 -0.31 0.16
19 CAG 0.29 -0.29 -0.37 0.18 0.45 -0.39 0.49 -0.27 0.15 -0.24 0.22 0.13
20 CAT 0.74 0.74 0.84 0.55 0.49 0.48 0.61 0.70 0.77 0.77 0.84 0.57
21 CCA 0.61 0.76 0.31 0.48 -0.33 0.31 0.04 0.41 0.32 0.29 0.28 0.66
22 CCC 0.57 -0.56 -0.62 0.12 0.38 -0.44 0.25 -0.31 -0.06 -0.51 -0.40 0.28
23 CCG 0.13 0.08 0.59 0.27 0.33 0.36 0.46 0.19 0.19 0.59 0.84 0.61
24 CCT 0.53 0.24 0.55 0.43 -0.13 -0.22 0.51 -0.16 0.80 -0.24 0.57 -0.18
25 CGA 0.73 0.37 0.77 0.77 -0.07 0.38 0.61 0.62 0.72 0.35 0.80 0.52
26 CGC 0.34 -0.49 -0.56 -0.03 0.51 -0.40 0.26 -0.29 -0.36 -0.47 -0.10 -0.08
27 CGG 0.20 0.32 0.58 0.48 0.14 0.48 0.11 0.34 0.58 0.51 0.25 0.40
28 CGT 0.77 -0.21 0.21 0.59 0.11 -0.48 0.09 0.08 0.66 0.48 0.78 0.53
29 CTA 0.80 0.47 0.71 0.91 -0.07 0.19 0.83 0.51 -0.26 0.12 0.45 0.38
30 CTC 0.20 -0.46 -0.49 0.14 0.40 -0.44 0.39 -0.07 0.28 -0.32 0.49 -0.25
31 CTG 0.42 -0.34 -0.22 0.38 0.61 -0.42 0.38 0.08 0.30 -0.24 -0.42 0.24
32 CTT 0.32 0.40 0.74 0.81 0.42 -0.04 0.43 -0.02 0.70 0.48 0.84 0.51
33 GAA 0.79 0.31 0.58 0.58 0.10 0.33 0.47 0.64 0.48 0.34 0.01 0.49
34 GAC 0.37 -0.42 -0.50 0.13 0.34 -0.47 0.46 -0.18 0.23 -0.39 0.50 -0.26
35 GAG 0.52 -0.37 -0.43 0.28 0.54 -0.43 0.33 -0.40 0.06 -0.38 0.22 0.09
36 GAT 0.66 0.42 0.66 0.42 -0.16 0.19 0.57 0.31 0.59 0.52 0.23 0.44
37 GCA 0.24 0.21 0.48 0.42 0.46 0.32 -0.10 0.43 0.20 0.17 -0.35 0.58
38 GCC 0.38 -0.61 -0.54 0.25 0.19 -0.47 0.11 -0.37 -0.08 -0.44 0.52 -0.14
39 GCG 0.66 0.26 0.42 0.30 0.50 0.42 0.54 0.24 0.06 0.53 0.63 0.73
40 GCT 0.38 0.66 0.32 0.78 0.49 -0.20 0.50 -0.27 -0.21 0.19 0.54 0.48
41 GGA 0.67 -0.01 -0.01 0.40 0.33 0.05 -0.03 -0.02 0.11 -0.36 0.53 0.02
42 GGC 0.11 -0.38 -0.51 0.33 0.52 -0.30 0.26 -0.05 -0.13 -0.48 -0.23 -0.13
43 GGG 0.72 0.53 0.61 0.64 0.64 0.44 0.70 0.57 0.83 0.66 0.91 0.57
44 GGT 0.48 -0.29 0.42 0.48 0.62 -0.53 0.57 -0.35 0.58 0.01 0.78 0.15
45 GTA 0.79 0.65 0.64 0.58 0.56 0.50 0.75 0.35 0.70 0.59 0.68 0.34
46 GTC 0.25 -0.60 -0.56 0.39 0.66 -0.55 0.32 -0.22 0.33 -0.33 -0.26 -0.02
47 GTG 0.63 0.21 0.37 0.71 0.32 -0.22 0.19 0.20 0.22 0.22 -0.23 0.67
48 GTT 0.60 0.51 0.54 0.64 0.72 0.01 0.56 0.05 0.83 0.41 0.69 0.75
49 TAA 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50 TAC 0.34 -0.49 -0.45 0.39 0.49 -0.52 0.59 -0.31 -0.42 -0.25 0.68 0.08
51 TAG 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
52 TAT 0.84 0.72 0.81 0.74 0.52 0.48 0.73 0.22 1.00 0.62 -0.17 0.80
53 TCA 0.40 0.60 0.66 -0.06 -0.64 0.33 -0.43 0.57 0.36 0.33 -0.63 0.36
54 TCC 0.42 -0.55 -0.57 0.35 0.69 -0.47 0.48 -0.28 0.05 -0.30 -0.28 0.13
55 TCG 0.18 -0.02 0.13 -0.09 0.13 0.02 0.29 0.24 0.39 0.30 0.07 0.43
56 TCT 0.70 0.75 0.64 0.74 0.50 0.02 0.21 -0.05 0.22 0.29 0.70 0.82
57 TGA 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
58 TGC 0.57 -0.33 -0.33 -0.17 0.20 -0.46 0.11 -0.31 -0.05 0.39 -0.52 0.41
59 TGG 0.19 -0.20 -0.06 0.43 0.63 -0.28 0.65 -0.34 -0.50 0.03 0.09
-0.1260TGT0.200.740.420.500.670.410.810.640.420.771.000.3861TTA0.850.850.450.510.490.690.360.690.410.530.160.5662TTC0.18-0.45-0.350.420.33-0.460.36-0.180.46-0.140.170.0963TTG-0.26-0.020.10-0.050.25-0.21-0.06-0.220.420.300.590.1764TTT0.700.820.820.770.530.490.540.62-0.230.52-0.270.76AAAAACAAGAATACAACCACGACTAGAAGCAGGAGT123456789101112表C.2繼續(xù)131415161718192021222324ATAATCATGATTCAACACCAGCATCCACCCCCGCCTlAAA0.110.750.530.32-0.36-0.090.550.560.610.290.530.802AAC0.76-0.39-0.10-0.080.35-0.44-0.460.400.62-0.560.04-0.153AAG0.63-0.36-0.160.170.47-0.37-0.260.550.53-0.510.23-0.084AAT0.180.600.190.530.790.610.630.150.790.550.770.16ACA0.700.590.370.47-0.150.35-0.170.490.030.050.730.166ACC0.31-0.53-0.41-0.280.37-0.53-0.280.680.62-0.54-0.010.03"7ACG-0.240.580.520.470.530.270.340.270.610.530.530.398ACT0.700.140.370.550.38-0.25-0.220.170.31-0.510.57-0.319AGA0.12-0.390.100.62-0.14-0.260.060.250.500.300.310.0210AGC0.64-0.23-0.26-0.010.43-0.05-0.280.380.58-0.160.15-0.1111AGG-0.600.550.050.750.20-0.390.470.650.150.630.76-0.2112AGT0.550.220.390.450.490.210.230.540.270.290.530.1513ATA-0.150.740.560.620.060.74-0.040.84-0.38-0.070.880.4614ATC0.58-0.43-0.31-0.070.33-0.52-0.290.520.04-0.390.250.0915ATG-0.07-0.160.000.330.10-0.26-0.060.410.22-0.410.280.4316ATT0.840.480.570.280.070.240.340.420.63-0.100.180.0117CAA0.770.300.250.560.21-0.190.440.28-0.32-0.05-0.130.2218CAC0.93-0.49-0.20-0.10-0.37-0.52-0.180.570.38-0.550.39-0.2619CAG-0.08-0.32-0.120.080.20-0.17-0.300.240.49-0.290.43-0.0720CAT0.440.530.300.620.090.490.610.250.260.620.62-0.0821CCA0.620.220.650.570.390.660.680.660.170.26-0.33-0.0822CCC0.17-0.48-0.42-0.340.40-0.57-0.510.540.370.26-0.08-0.0523CCG0.200.420.110.230.340.490.350.570.100.520.260.6924CCT-0.280.340.320.290.25-0.46-0.220.17-0.45-0.410.14-0.3925CGA0.660.440.480.570.730.570.490.490.690.640.41-0.2526CGC0.73-0.09-0.34-0.070.26-0.40-0.450.570.52-0.570.23-0.2027CGG0.240.400.360.480.460.510.710.71-0.140.500.420.7628CGT-0.19-0.610.00-0.010.02-0.55-0.48-0.180.44-0.620.18-0.2929CTA0.330.
800.470.57-0.16-0.380.480.37-0.30-0.210.480.0530CTC0.65-0.35-0.18-0.080.50-0.38-0.410.620.58-0.480.35-0.0131CTG0.71-0.40-0.25-0.050.34-0.35-0.200.320.62-0.300.310.2032CTT0.610.440.560.56-0.24-0.16-0.100.39-0.41-0.460.50-0.3798<table>tableseeoriginaldocumentpage99</column></row><table>17CAA0.590.600.420.060.460.400.370.280.230.310.450.2718CAC0.60-0.36-0.01-0.66-0.26-0.46-0.41-0.10-0.12-0.43-0.220.3519CAG0.50-0.330.45-0.510.66-0.44-0.28-0.150.28-0.21-0.35-0.0620CAT0.650.660.470.23-0.300.600.520.760.240.320.290.1921CCA0.780.390.330.570.350.650.100.490.190.450.330.5022CCC0.49-0.310.55-0.69-0.29-0.340.040.090.07-0.32-0.46-0.1723CCG0.660.420.530.710.500.490.24-0.050.51-0.090.450.4524CCT-0.22-0.340.61-0.640.43-0.35-0.41-0.330.14-0.31-0.130.2925CGA0.640.620.460.690.100.730.450.350.340.300.690.7126CGC-0.13-0.450.37-0.620.17-0.30-0.110.350.170.01-0.220.2027CGG0.190.630.760.600.530.780.620.580.560.360.460.2028CGT0.48-0.450.21-0.69-0.16-0.63-0.59-0.470.09-0.62-0.64-0.3329CTA0.45-0.51-0.01-0.25-0.36-0.300.520.550.730.290.390.1730CTC0.69-0.540.07-0.650.70-0.44-0.18-0.280.14-0.29-0.330.0231CTG0.54-0.220.52-0.350.69-0.18-0.010.040.13-0.10-0.180.1332CTT0.580.020.60-0.330.33-0.31-0.24-0.050.26-0.34-0.300.1033GAA0.140.490.630.220.790.250.490.420.24-0.080.300.2234GAC0.52-0.44-0.15-0.58-0.09-0.33-0.10-0.260.32-0.05-0.210.3435GAG0.40-0.320.39-0.580.57-0.42-0.29-0.340.34-0.11-0.340.0536GAT-0.090.440.660.300.610.060.02-0.070.50-0.29-0.210.1337GCA0.390.640.600.670.480.730.520.390.410.350.170.4738GCC0.72-0.340.49-0.560.72-0.36-0.19-0.230.42-0.38-0.25-0.1239GCG0.740.650.520.310.230.260.350.440.37-0.080.360.3440GCT0.59-0.530.48-0.650.62-0.47-0.36-0.370.200.13-0.440.0941GGA0.710.140.62-0.100.640.560.580.480.480.510.300.2542GGC0.33-0.350.60-0.600.080.170.080.290.330.17-0.180.3743GGG-0.130.810.810.150.450.350.720.620.590.410.640.6544GGT0.65-0.270.59-0.64-0.11-0.64-0.54-0.47-0.12-0.58-0.47-0.2645GTA0.770.160.910.660.780.560.620.460.750.710.700.4846GTC0.57-0.520.25-0.620.28-0.40-0.23-0.310.43-0.28-
0.230.1547GTG0.66-0.030.520.450.780.060.150.070.360.190.150.5048GTT0.78-0.200.53-0.25-0.07-0.37-0.14-0.100.01-0.48-0.470.0749TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.57-0.420.23-0.540.38-0.36-0.24-0.210.18-0.28-0.330.2551TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.570.630.390.56-0.110.22-0.070.480.500.160.260.0953TCA-0.180.680.630.34-0.550.240.460.460.540.270.11-0.0354TCC0.77-0.480.14-0.52-0.21-0.40-0.30-0.010.34-0.54-0.380.0955TCG0.490.200.530.55-0.540.340.100.320.420.200.180.1956TCT-0.02-0.470.43-0.550.42-0.15-0.30-0.320.140.16-0.200.2657TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.820.050.73-0.59-0.22-0.02-0.040.340.090.02-0.200.1359TGG0.74-0.280.61-0.02-0.43-0.120.050.170.210.11-0.12-0.1060TGT0.410.71-0.19-0.460.09-0.05-0.11-0.090.240.320.16-0.3761TTA0.100.930.430.78-0.050.480.401.000.430.230.850.8162TTC0.67-0.370.43-0.610.04-0.26-0.35-0.150.31-0.17-0.210.2563TTG0.560.630.590.30-0.13-0.010.290.010.370.240.080.2264TTT0.820.640.770.68-0.500.640.680.450.03-0.350.100.45CGACGCCGGCGTCTACTCCTGCTTGAAGACGAGGAT252627282930313233343536表C.2繼續(xù)373839404142434445464748GCAGCCGCGGCTGGAGGCGGGGGTGTAGTCGTGGTT1001AAA0.410.430.350.59-0.150.240.210.760.900.310.570.602AAC0.66-0.300.33-0.040.12-0.320.53-0.220.92-0.400.38-0.033AAG0.49-0.430.52-0.270.38-0.140.85-0.470.73-0.420.29-0.204AAT-0.06-0.110.24-0.130.340.110.360.270.57-0.070.140.15ACA0.530.450.570.540.640.460.450.530.060.550.370.696ACC0.30-0.140.53-0.140.19-0.170.69-0.460.77-0.450.29-0.317ACG0.24-0.120.500.180.410.410.700.290.040.570.180.128ACT-0.18-0.470.05-0.54-0.03-0.300.55-0.610.24-0.380.25-0.309AGA0.160.270.500.470.180.630.820.210.420.540.350.4510AGC0.590.140.58-0.080.53-0.050.71-0.350.730.050.17-0.1211AGG-0.340.090.820.440.570.490.810.39-0.180.210.510.4712AGT0.46-0.260.240.050.350.160.53-0.320.36-0.16-0.17-0.0913ATA0.600.700.790.770.160.860.830.83-0.540.730.520.6914ATC0.61-0.060.62-0.040.400.200.79-0.440.72-0.260.530.1515ATG0.30-0.240.38-0.080.330.060.36-0.330.58-0.230.30-0.111
6ATT0.29-0.550.02-0.44-0.03-0.210.39-0.470.00-0.450.07-0.2317CAA-0.020.240.480.370.080.170.680.030.500.340.540.5518CAC0.20-0.340.32-0.27-0.04-0.150.80-0.440.45-0.430.71-0.3519CAG0.33-0.290.27-0.330.32-0.120.75-0.450.57-0.420.16-0.2920CAT-0.420.390.430.450.010.160.520.270.92-0.150.480.2021CCA0.450.570.350.090.35-0.07-0.290.38-0.250.570.540.5522CCC0.57-0.430.57-0.290.09-0.200.64-0.570.12-0.48-0.08-0.3123CCG0.140.220.09-0.170.560.490.640.530.680.430.420.1224CCT0.53-0.260.55-0.440.120.050.60-0.480.59-0.280.11-0.1725CGA0.680.660.750.600.47-0.170.530.700.780.130.470.5826CGC0.22-0.150.65-0.150.440.100.60-0.060.090.140.200.1227CGG0.520.430.390.300.630.580.740.420.250.550.110.2728CGT0.09-0.69-0.02-0.66-0.35-0.430.45-0.820.51-0.73-0.26-0.4629CTA0.53-0.13-0.23-0.32-0.39-0.220.660.730.440.560.75-0.2230CTC0.48-0.220.52-0.070.130.050.78-0.350.67-0.380.38-0.0131CTG0.24-0.280.20-0.260.07-0.070.65-0.320.35-0.350.11-0.0932CTT0.34-0.450.45-0.46-0.32-0.260.58-0.58-0.07-0.330.18-0.2833GAA0.160.030.210.380.300.050.490.310.70-0.160.340.3434GAC0.60-0.120.490.260.40-0.160.69-0.140.590.060.490.0235GAG0.51-0.290.44-0.370.34-0.150.77-0.490.69-0.310.37-0.3236GAT0.53-0.420.09-0.290.04-0.170.06-0.070.36-0.360.15-0.2737GCA0.290.660.420.510.230.300.390.50-0.220.510.370.5338GCC0.61-0.280.42-0.060.540.230.75-0.490.59-0.410.34-0.2839GCG0.460.280.440.480.510.510.740.320.160.450.450.5140GCT0.31-0.530.12-0.58-0.05-0.350.57-0.640.07-0.400.19-0.3641GGA0.140.380.500.420.450.460.800.310.780.500.630.2342GGC0.510.360.750.110.490.280.750.080.640.470.540.2843GGG0.610.780.840.460.600.830.890.780.730.730.740.7744GGT0.39-0.660.32-0.67-0.15-0.330.63-0.780.63-0.66-0.22-0.6945GTA-0.11-0.050.340.730.510.51-0.210.890.660.390.460.7046GTC0.43-0.160.66-0.200.22-0.140.47-0.530.67-0.360.41-0.0447GTG0.590.140.580.210.500.370.710.550.750.220.390.3848GTT-0.13-0.470.22-0.530.02-0.170.41-0.560.57-0.510.06-0.5049TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.66-0.320.52-0.240.41-0.220.06-0.220.77-0.16-0.100.1551TAG0.000
.000.000.000.000.000.000.000.000.000.000.0052TAT0.45-0.260.59-0.130.38-0.140.520.110.36-0.040.20-0.2553TCA0.550.540.160.310.430.820.620.780.440.220.490.5754TCC0.10-0.390.53-0.320.00-0.370.52-0.500.47-0.540.10-0.3655TCG0.210.200.400.130.410.390.660.350.570.370.270.3510156TCT0.20-0.440.26-0.510.08-0.420.06-0.370.33-0.090.33-0.1657TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.630.140.31-0.430.410.260.16-0.43-0.170.350.580.2359TGG0.350.000.00-0.190.420.120.68-0.45-0.150.49-0.17-0.3160TGT0.50-0.230.25-0.230.210.310.44-0.400.66-0.580.35-0.4661TTA0.680.700.700.450.920.95-0.020.810.360.79-0.141.0062TTC0.77-0.260.66-0.240.460.060.80-0.430.09-0.350.450.0163TTG0.550.290.450.270.470.470.780.000.230.340.330.0164TTT0.19-0.270.01-0.170.06-0.020.55-0.290.74-0.140.230.26GCAGCCGCGGCTGGAGGCGGGGGTGTAGTCGTGGTT373839404142434445464748表C.2繼續(xù)495051525354555657585960TAATACTAGTATTCATCCTCGTCTTGATGCTGGTGT1AAA0.000.050.000.44-0.600.470.210.510.000.06-0.070.162AAC0.00-0.380.000.110.51-0.47-0.17-0.170.00-0.39-0.300.08AAG0.00-0.270.000.430.32-0.380.040.040.00-0.170.030.254AAT0.000.470.000.740.470.660.450.430.000.580.720.65ACA0.000.100.000.360.290.260.350.180.000.31-0.110.266ACC0.00-0.540.000.410.29-0.58-0.30-0.010.00-0.44-0.220.297ACG0.000.110.000.590.180.400.220.460.000.520.030.658ACT0.000.470.000.540.38-0.120.33-0.220.00-0.190.53-0.029AGA0.00-0.270.00-0.510.12-0.310.72-0.190.000.27-0.66-0.1710AGC0.00-0.450.000.29-0.12-0.280.22-0.220.00-0.11-0.40-0.0611AGG0.000.710.000.460.18-0,430.560.300.000.870.490.3612AGT0.000.280.000.330.310.550.700.470.000.300.650.6113ATA0.000.300.00-0.42-0.840.700.67-0.570.000.400.58-0.5414ATC0.00-0.410.000.350.64-0.590.23-0.050.00-0.17-0.32-0.1715ATG0.00-0.220.000.450.34-0.170.390.090.00-0.110.000.2116ATT0.000.550.000.470.690.410.480.480.000.510.620.2517CAA0.000.060.00-0.090.590.190.290.400.000.16-0.04-0.4218CAC0.00-0.470.000.350.17-0.54-0.01-0.330.00-0.50-0.35-0.0419CAG0.00-0.160.000.320.43-0.42-0.05-0.080.00-0.100.030.4520CAT0.000.510.000.450.540.420.380
.230.000.660.640.8921CCA0.000.090.000.520.290.250.37-0.050.000.260.450.3822CCC0.00-0.500.000.510.76-0.590.33-0.170.00-0.42-0.330.3723CCG0.00-0.020.000.23-0.330.390.370.190.000.640.381.0024CCT0.000.100.000.520.67-0.120.17-0.060.00-0.45-0.07-0.0725CGA0.00-0.220.000.71-0.430.170.260.670.000.450.550.0826CGC0.00-0.350.000.430.26-0.53-0.27-0.360.00-0.610.000.4027CGG0.000.340.000.560.360.320.560.500.000.510.440.5328CGT0.00-0.390.000.500.26-0.280.220.140.00-0.08-0.020.4329CTA0.00-0.150.000.610.59-0.250.540.570.000.47-0.111.0030CTC0.00-0.410.000.240.11-0.460.27-0.130.00-0.42-0.44-0.1131CTG0.00-0.280.000.31-0.10-0.210.11-0.150.00-0.010.110.2332CTT0.000.200.000.590.25-0.180.03-0.300.00-0.510.320.3633GAA0.00-0.210.000.330.460.070.180.350.000.45-0.130.2634GAC0.00-0.470.000.420.61-0.44-0.12-0.320.00-0.43-0.330.0035GAG0.00-0.200.000.480.56-0.400.20-0.090.00-0.380.100.3636GAT0.000.300.000.560.680.200.440.290.000.620.520.2537GCA0.000.130.000.38-0.110.030.230.210.000.170.100.4538GCC0.00-0.590.000.380.57-0.560.17-0.130.00-0.25-0.33-0.2339GCG0.000.540.000.550.180.520.520.270.000.690.390.6840GCT0.000.440.000.270.49-0.320.26-0.130.00-0.260.27-0.1541GGA0.00-0.250.000.030.23-0.150.190.390.000.020.03-0.2642GGC0.00-0.260.000.310.46-0.370.38-0.280.00-0.22-0.40-0.0743GGG0.000.560.000.120.330.610.400.370.000.680.700.2944GGT0.00-0.160.000.490.75-0.320.510.060.00-0.100.420.3845GTA0.00-0.300.000.580.110.08-0.05-0.520.000.430.450.2246GTC0.00-0.440.000.250.51-0.550.22-0.300.00-0.50-0.300.2647GTG0.000.190.000.480.110.390.460.080.000.030.060.7048GTT0.000.080.000.570.02-0.270.430.160.000.240.420.5249TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.00-0.430.000.180.23-0.400.04-0.500.00-0.43-0.280.1351TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.000.610.000.710.840.430.600.270.000.780.650.5753TCA0.000.390.000.41-0.19-0.070.58-0.100.000.340.210.3254TCC0.00-0.470.000.310.27-0.520.47-0.220.00-0.48-0.250.3655TCG0.000.340.000.520.110.220.110.140.000.380.480.7456TCT0.000.170.000.140.22-0.
48 0.02 -0.34 0.00 -0.40 0.12 0.28
57 TGA 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
58 TGC 0.00 -0.32 0.00 0.35 0.30 -0.47 -0.45 -0.47 0.00 -0.30 -0.25 -0.08
59 TGG 0.00 -0.19 0.00 0.38 -0.24 -0.13 0.28 0.17 0.00 0.11 0.00 -0.16
60 TGT 0.00 0.34 0.00 0.14 0.61 0.51 0.75 0.49 0.00 0.50 0.56 0.50
61 TTA 0.00 0.54 0.00 0.36 -0.20 0.56 0.41 0.49 0.00 0.83 0.55 0.72
62 TTC 0.00 -0.42 0.00 0.33 0.51 -0.49 0.38 -0.19 0.00 -0.08 -0.21 -0.35
63 TTG 0.00 0.20 0.00 0.09 0.43 0.31 0.43 0.25 0.00 0.79 0.54 0.25
64 TTT 0.00 0.68 0.00 0.77 0.65 0.27 0.43 -0.09 0.00 0.74 0.59 0.30
Right codon: TAA TAC TAG TAT TCA TCC TCG TCT TGA TGC TGG TGT
Column no.: 49 50 51 52 53 54 55 56 57 58 59 60

Table C.2 (continued)
Column no.: 61 62 63 64
Right codon: TTA TTC TTG TTT
1 AAA 0.72 0.34 0.10 0.50
2 AAC 0.79 -0.39 0.00 0.31
3 AAG 0.15 -0.27 0.34 0.39
4 AAT 0.46 0.65 0.37 0.45
5 ACA 0.00 0.14 0.08 0.37
6 ACC 0.89 -0.37 0.19 0.54
7 ACG 0.81 0.46 0.68 -0.10
8 ACT 0.82 -0.09 0.21 0.15
9 AGA 0.10 0.21 0.18 0.66
10 AGC 0.55 -0.12 -0.01 0.21
11 AGG 0.05 0.52 -0.18 0.71
12 AGT 0.42 0.37 0.20 0.66
13 ATA 0.29 0.42 0.71 0.42
14 ATC 0.50 -0.41 0.24 0.29
15 ATG 0.52 -0.16 0.35 0.45
16 ATT 0.85 0.54 0.20 0.53
17 CAA 0.05 0.11 -0.12 0.34
18 CAC 0.73 -0.37 0.05 0.07
19 CAG 0.70 -0.26 0.51 0.46
20 CAT 0.79 0.34 0.33 0.71
21 CCA -0.07 0.31 0.36 0.22
22 CCC 0.55 -0.51 -0.32 0.32
23 CCG 0.41 0.39 0.25 0.40
24 CCT 0.57 0.01 0.37 0.66
25 CGA 1.00 0.42 0.54 0.43
26 CGC -0.29 -0.44 0.11 -0.20
27 CGG 0.61 0.30 0.03 0.57
28 CGT 1.00 -0.37 0.02 0.46
29 CTA -0.64 -0.19 0.53 0.00
30 CTC 0.65 -0.44 0.10 0.18
31 CTG 0.68 -0.06 0.25 0.25
32 CTT 0.49 0.00 0.07 0.49
33 GAA 0.55 0.13 0.20 0.39
34 GAC 0.71 -0.41 0.19 -0.13
35 GAG 0.78 -0.28 0.22 0.43
36 GAT 0.87 0.44 0.37 0.78
37 GCA 0.50 0.46 0.13 0.26
38 GCC 0.82 -0.17 0.03 0.17
39 GCG 0.46 0.20 0.44 -0.07
40 GCT 0.49 -0.36 0.11 0.46
41 GGA 0.31 -0.14 0.18 0.36
42 GGC 0.76 -0.12 0.12 0.43
43 GGG 0.53 0.63 -0.05 0.35
44 GGT 0.94 -0.44 0.52 0.75
45 GTA 0.57 0.27 0.48 0.63
46 GTC 0.58 -0.43 0.26 0.29
47 GTG 0.93 0.10 0.30 0.35
48 GTT 0.59 0.22 0.39 0.51
49 TAA 0.00 0.00 0.00 0.00
50 TAC 0.87 -0.37 0.27 0.14
51 TAG 0.00 0.00 0.00 0.00
52 TAT 0.79 0.56 0.58 0.65
53 TCA 0.09 0.29 0.12 0.49
54 TCC 0.65 -0.32 0.19 0.39
55 TCG 0.62 -0.11 0.44 0.33
56 TCT -0.41 -0.32 0.27 0.27
57 TGA 0.00 0.00 0.00 0.00
58 TGC 0.30 -0.28 -0.18 0.20
59 TGG 0.77 -0.14 0.11 0.37
60 TGT -0.14 0.33 0.44 0.38
61 TTA 0.66 0.79 0.67 0.40
62 TTC 0.75 -0.37 0.13 0.34
63 TTG 0.59 0.40 0.33 0.39
64 TTT 0.77 0.63 0.50 0.80
Right codon: TTA TTC TTG TTT
Column no.: 61 62 63 64

Table C.3: CPW matrix for the full B. subtilis genome (the left codon is indicated in column 2, the right codon in row 2). Host cell: B. subtilis; sequence data: full B. subtilis genome.
Column no.: 1 2 3 4 5 6 7 8 9 10 11 12
Right codon: AAA AAC AAG AAT ACA ACC ACG ACT AGA AGC AGG AGT
1 AAA 0.02 -0.28 -0.11 0.04 -0.28 -0.13 -0.23 0.44 0.16 -0.39 -0.02 0.04
2 AAC -0.04 -0.22 0.01 -0.16 0.09 0.10 0.06 0.10 -0.17 -0.63 -0.38 -0.54
3 AAG 0.00 0.33 0.18 0.36 0.43 0.59 0.38 0.50 0.48 0.22 0.28 0.05
4 AAT -0.03 0.11 0.13 0.24 -0.08 -0.05 -0.20 0.29 0.61 0.57 0.56 0.32
5 ACA -0.22 -0.31 -0.27 -0.16 -0.13 -0.24 -0.13 -0.33 -0.45 -0.30 -0.51 -0.25
6 ACC 0.40 0.19 0.43 -0.13 0.42 0.10 0.61 -0.12 0.05 0.37 0.16 0.02
7 ACG -0.06 0.22 -0.20 0.04 -0.18 0.17 0.01 -0.31 -0.37 0.26 -0.22 -0.02
8 ACT 0.57 0.69 0.54 0.49 0.45 0.53 0.68 0.18 0.36 0.75 0.32 0.46
9 AGA -0.31 -0.31 -0.32 -0.18 -0.09 -0.30 -0.37 0.22 -0.44 -0.34 -0.59 -0.20
10 AGC -0.09 0.05 -0.15 -0.14 -0.07 -0.15 -0.10 -0.20 -0.31 -0.32 -0.59 -0.38
11 AGG 0.30 0.52 0.23 0.29 0.35 0.51 0.45 0.43 -0.07 0.44 -0.02 0.26
12 AGT 0.10 0.54 0.28 0.28 0.03 0.52 0.03 0.15 0.71 0.87 0.82 0.80
13 ATA -0.40 -0.15 -0.37 -0.08 0.24 -0.23 0.12 -0.14 -0.11 0.28 -0.43 0.06
14 ATC -0.15 -0.17 0.07 -0.29 -0.05 -0.13 0.15 -0.21 -0.44 -0.50 -0.47 -0.48
15 ATG 0.11 0.08 -0.21 -0.06 0.02 0.14 -0.10 -0.01 -0.31 -0.09 -0.32 -0.19
16 ATT 0.20 0.21 0.31 0.31 -0.02 0.03 -0.03 0.18 0.69 0.85 0.66 0.60
17 CAA -0.27 -0.43 -0.38 -0.26 -0.10 -0.46 -0.26 -0.02 -0.18 -0.40 -0.18 -0.17
18 CAC 0.14 -0.04 -0.04 -0.22 0.26 -0.08 0.10 0.10 -0.32 -0.52 -0.40 -0.44
19 CAG 0.47 0.58 0.40 0.52 0.31 0.41 0.16 0.33 0.50 0.65 0.29 0.63
20 CAT -0.06 0.01 0.02 0.15 -0.12 0.01 -0.09 0.11 0.61 0.66 0.62 0.47
21 CCA -0.34 -0.49 -0.45 -0.34 0.09 -0.33 0.11 -0.37 -0.51 -0.48 -0.51 -0.50
22 CCC 0.50 0.35 0.59 -0.09 0.55 0.25 0.69 -0.11 0.14 0.30 0.83 0.11
23 CCG -0.02 0.18 -0.13 -0.12 -0.31 -0.09 -0.03 -0.51 -0.22 0.37 0.39 0.07
24 CCT 0.31 0.49 0.33 0.39 0.49 0.45 0.67 0.28 0.55 0.80 0.54 0.58
25 CGA -0.47 -0.52 -0.32 -0.45 -0.08 -0.21 -0.16 0.01 -0.32 -0.55 0.17 -0.22
26 CGC 0.31 0.13 0.10 -0.10 0.15 -0.06 0.14 0.27 -0.13 -0.33 -0.15 -0.31
27 CGG 0.21 0.48 0.21 0.04 -0.25 0.24 -0.43 -0.04 0.38 0.45 0.31 0.46
28 CGT 0.42 0.61 0.51 0.50 0.24 0.59 0.17 0.36 0.89 0.90 0.90 0.85
29 CTA -0.36 -0.20 -0.43 -0.27 0.29 -0.36 0.28 -0.06 -0.37 -0.31 -0.47 -0.23
30 CTC 0.37 0.34 0.27 -0.02 0.34 -0.09 0.50 -0.11 -0.39 -0.38 -0.04 -0.33
31 CTG -0.10 0.03 -0.20 -0.13 -0.26 -0.18 -0.23 -0.41 0.09 0.45 0.10 0.39
32 CTT 0.67 0.61 0.68 0.55 0.56 0.44 0.61 0.31 0.54 0.74 0.60 0.68
33 GAA -0.09 -0.25 -0.06 -0.03 -0.20
-0.16-0.210.280.08-0.370.15-0.1534gac-0.05-0.16-0.05-0.210.130.010.18-0.07-0.35-0.65-0.52-0.5835gag0.200.400.170.320.360.570.160.350.340.230.500.1436gat-0.010.070.140.18-0.090.09-0.150.130.720.770.720.5637gca-0.29-0.29-0.24-0.19-0.21-0.32-0.07-0.38-0.46-0.27-0.39-0.3538gcc0.13-0.100.05-0.350.160.120.44-0.31-0.47-0.230.06-0.2539gcg-0.070.30-0.150.04-0.250.210.05-0.26-0.250.440.310.2640gct0.430.440.470.500.330.530.560.130.650.870.760.6141gga-0.33-0.38-0.11-0.24-0.21-0.23-0.36-0.14-0.29-0.45-0.20-0.2642ggc0.300.270.210.110.100.070.14-0.01-0.27-0.30-0.36-0.2743ggg-0.24-0.01-0.13-0.160.100.340.100.220.070.100.21-0.0244ggt0.320.520.510.510.230.620.230.080.810.880.850.8445gta-0.34-0.34-0.37-0.22-0.09-0.23-0.210.05-0.28-0.09-0.400.0046gtc0.000.010.43-0.260.01-0.260.03-0.19-0.55-0.56-0.38-0.4747gtg-0.210.05-0.29-0.16-0.25-0.05-0.170.020.220.550.190.4248gtt0.560.490.490.530.410.360.410.130.680.870.760.6649TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.01-0.320.10-0.20-0.04-0.03-0.030.02-0.36-0.60-0.49-0.5851TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT-0.060.110.090.24-0.050.20-0.120.240.650.750.680.5453TCA-0.19-0.22-0.19-0.22-0.16-0.27-0.04-0.44-0.32-0.23-0.36-0.1654TCC0.250.060.37-0.260.280.060.45-0.15-0.42-0.220.29-0.3055TCG-0.020.32-0.14-0.06-0.050.080.24-0.28-0.030.580.220.2056TCT0.150.270.240.240.230.310.51-0.130.700.860.560.5857TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.06-0.09-0.14-0.150.21-0.200.050.310.08-0.39-0.34-0.1259TGG0.130.06-0.23-0.04-0.020.29-0.190.15-0.28-0.200.04-0.1360TGT0.010.330.020.050.01-0.26-0.300.200.820.910.830.706iTTA-0.35-0.36-0.37-0.35-0.17-0.33-0.32-0.33-0.26-0.16-0.19-0.2162TTC-0.090.020.02-0.040.12-0.180.29-0.29-0.54-0.49-0.58-0.4663TTG-0.080.10-0.22-0.190.150.040.10-0.180.000.430.110.1664TTT-0.020.030.14-0.010.02-0.160.000.050.490.800.500.57AAAAACAAGAATACAACCACGACTAGAAGCAGGAGT123456789101112表C.3繼續(xù)131415161718192021222324ATAATCATGATTCAACACCAGCATCCACCCCCGCCT1AAA-0.36-0.29-0.140.220.070.140.150.350.070.
350.190.452AAC0.170.010.040.060.090.020.320.170.200.170.190.123AAG0.010.320.380.29-0.12-0.37-0.28-0.41-0.20-0.31-0.41-0.424AAT-0.25-0.06-0.030.05-0.230.04-0.02-0.14-0.26-0.27-0.110.065ACA0.160.230.040.360.020.09-0.050.220.080.27-0.030.056ACC-0.05-0.230.19-0.520.00-0.050.11-0.290.290.400.470.367ACG-0.35-0.19-0.35-0.370.050.180.300.24-0.320.05-0.24-0.338ACT0.510.780.640.66-0.34-0.39-0.08-0.330.050.140.29-0.099AGA-0.14-0.09-0.110.110.210.23-0.190.280.440.230.260.3610AGC-0.090.06-0.21-0.02-0.20-0.07-0.20-0.130.350.360.100.3411AGG-0.140.310.210.25-0.13-0.26-0.40-0.27-0.29-0.02-0.33-0.3112AGT-0.220.24-0.06-0.080.080.430.180.01-0.21-0.310.00-0.0713ATA0.020.680.560.72-0.190.04-0.090.01-0.05-0.13-0.170.0514ATC-0.11-0.24-0.19-0.260.310.380.380.240.200.110.280.3215ATG0.090.020.00-0.040.27-0.04-0.220.020.260.10-0.11-0.0216ATT0.220.030.030.07-0.23-0.25-0.10-0.13-0.21-0.29-0.12-0.0817CAA-0.21-0.33-0.30-0.140.640.760.700.740.690.730.870.8118CAC0.260.050.050.050.150.000.170.00-0.010.070.170.1319CAG0.450.420.450.19-0.31-0.44-0.48-0.43-0.40-0.44-0.46-0.4720CAT0.04-0.03-0.02-0.06-0.05-0.03-0.090.01-0.10-0.260.00-0.0221CCA-0.02-0.22-0.320.070.580.540.650.640.400,710.870.6522CCC0.420.150.360.06-0.03-0.350.34-0.110.130.690.44-0.0223CCG0.12-0.12-0.18-0.36-0.24-0.09-0.14-0.15-0.40-0.24-0.22-0.3524CCT0.430.570.520.46-0.08-0.02-0.09-0.12-0.050.080.230.0325CGA-0.15-0.37-0.38-0.230.680.600.590.570.640.790.710.8526CGC0.27-0.050.180.08-0.06-0.29-0.32-0.310.28-0.07-0.120.0127CGG-0.05-0.03-0.15-0.300.00-0.14-0.12-0.250.17-0.22-0.23-0.2028CGT0.330.220.340.210.140.240.320.23-0.24-0.43-0.32-0.2829CTA-0.170.200.120.380.540.650.600.480.750.760.800.7930CTC-0.04-0.10-0.11-0.240.210.370.440.170.22-0.090.150.1031CTG0.29-0.25-0.22-0.21-0.23-0.14-0.040.01-0.25-0.12-0.32-0.2632CTT0.470.530.580.48-0.25-0.42-0.23-0.30-0.20-0.29-0.18-0.32106<table>tableseeoriginaldocumentpage107</column></row><table>17CAA0.490.730.680.780.430.810.820.850.140.02-0.110.1918CAC-0.19-0.53-0.27-0.460.100.210.310.110.270.360.140.1219C
AG-0.39-0.43-0.52-0.45-0.36-0.50-0.58-0.58-0.030.04-0.10-0.1920CAT0.100.250.130.130.030.100.260.26-0.06-0.14-0.17-0.0621CCA0.600.670.730.600.530.690.770.720.250.07-0.020.2522CCC-0.48-0.62-0.11-0.570.210.190.610.400.250.08-0.17-0.1723CCG-0.12-0.22-0.15-0.22-0.26-0.55-0.33-0.520.210.360.280.3124CCT0.450.380.630.040.150.290.650.24-0.33-0.32-0.34-0.4025CGA0.530.650.680.700.510.640.590.660.360.280.050.3026CGC0.09-0.51-0.23-0.510.100.09-0.060.140.580.290.240.3527CGG-0.010.360.160.210.12-0.37-0.51-0.41-0.04-0.04-0.14-0.0628CGT0.850.760.850.48-0.03-0.30-0.05-O.U-0.27-0.34-0.17-0.3329CTA0.850.820.760.720.150.720.780.700.190.24-0.110.1630CTC0.08-0.320.07-0.300.420.280.490.380.350.100.49-0.0631CTG0.10-0.12-0.18-0.130.09-0.32-0.44-0.310.300.390.050.2732CTT0.10-0.310.02-0.210.06-0.140.340.11-0.35-0.09-0.38-0.4433GAA-0.230.20-0.110.330.290.480.270.58-0.02-0.07-0.130.1634GAC-0.38-0.54-0.45-0.510.270.280.320.250.360.390.250.2535GAG-0.13-0.21-0.49-0.34-0.39-0.35-0.60-0.630.100.080.17-0.2336GAT0.160.490.330.390.040.060.170.10-0.18-0.18-0.09-0.1337GCA0.170.190.360.260.160.150.240.240.00-0.05-0.020.1438GCC-0.01-0.46-0.18-0.380.400.310.470.270.430.370.180.2039GCG0.100.210.10-0.01-0.11-0.39-0.37-0.480.130.240.140.2140GCT0.680.640.730.220.030.150.530.03-0.31-0.36-0.29-0.3541GGA-0.11-0.050.060.030.250.300.270.27-0.14-0.28-0.040.0642GGC0.00-0.08-0.16-0.310.330.460.320.140.410.300.180.2743GGG-0.09-0.13-0.160.02-0.17-0.36-0.53-0.560.050.210.270.1244GGT0.710.790.790.670.140.020.34-0.20-0.37-0.38-0.28-0.2845GTA0.320.320.300.370.140.220.300.51-0.06-0.12-0.180.1246GTC0.20-0.320.18-0.180.470.450.500.500.490.460.370.1847GTG0.180.12-0.070.00-0.10-0.37-0.55-0.420.180.270.140.2548GTT0.210.010.16-0.02-0.05-0.210.180.04-0.35-0.31-0.27-0.3549TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC-0.33-0.42-0.39-0.400.050.340.420.360.290.290.190.1651TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT-0.110.320.080.29-0.120.110.390.36-0.18-0.160.04-0.0653TCA0.240.280.260.39-0.140.080.340.160.03-0.07-
0.140.0754TCC-0.07-0.45-0.11-0.410.120.270.510.170.550.460.340.3055TCG0.390.370.230.10-0.36-0.38-0.31-0.470.260.330.210.1856TCT0.400.600.650.35-0.19-0.160.41-0.12-0.25-0.20-0.15-0.3057TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC-0.49-0.53-0.45-0.470.130.27-0.020.350.500.430.140.3959TGG0.220.10-0.080.36-0.04-0.28-0.41-0.170.080.16-0.14-0.0860TGT0.840.830.640.77-0.31-0.210.000.03-0.29-0.36-0.37-0.3161TTA0.490.640.430.650.150.550.580.660.030.10-0.100.0762TTC-0.33-0.50-0.26-0.53-0.010.240.320.280.440.480.530.4463TTG-0.23-0.25-0.26-0.25-0.22-0.32-0.47-0.350.270.230.110.1564TTT0.430.420.380.43-0.08-0.060.370.39-0.22-0.18-0.07-0.17CGACGCCGGCGTCTACTCCTGCTTGAAGACGAGGAT252627282930313233343536表C.3繼續(xù)373839404142434445464748GCAGCCGCGGCTGGAGGCGGGGGTGTAGTCGTGGTT108<table>tableseeoriginaldocumentpage109</column></row><table>56TCT-0.11-0.160.13-0.310.440.580.560.23-0.31-0.400.09-0.4957TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.23-0.010.000.36-0.49-0.38-0.35-0.170.330.390.370.5859TGG-0.060.29-0.190.08-0.230.090.000.320.060.33-0.300.0560TGT-0.29-0.05-0.230.080.690.890.800.63-0.46-0.44-0.36-0.0261TTA0.250.210.260.270.100.12-0.17-0.070.290.340.250.3262TTC0.20-0.050.450.16-0.33-0.44-0.33-0.400.330.250.560.0863TTG-0.150.31-0.03-0.130.390.480.340.570.480.330.05-0.0664TTT0.01-0.28-0.090.040.350.190.370.330.07-0.31-0.06-0.07GCAGCCGCGGCTGGAGGCGGGGGTGTAGTCGTGGTT373839404142434445464748表C.3繼續(xù)495051525354555657585960TAATACTAGTATTCATCCTCGTCTTGATGCTGGTGT1AAA0.00-0.180.000.06-0.190.23-0.210.400.00-0.09-0.030.082AAC0.00-0.080.00-0.090.230.090.310.270.00-0.29-0.25-0.253AAG0.000.110.000.080.340.390.180.270.000.170.07-0.104AAT0.000.010.000.110.060.270.090.280.000.300.260.305ACA0.000.000.000.12-0.12-0.03-0.010.060.000.050.000.006ACC0.000.110.00-0.170.30-0.09-0.20-0.130.00-0.140.42隱O.Ol7ACG0.00-0.030.00-0.070.210.240.390.180.000.050.100.538ACT0.00-0.120.000.07-0.32-0.150.02-0.380.00-0.04-0.37-0.489AGA0.000.180.000.380.360.450.030.500.000.14-0.160.4410AGC0.00-0.130.00-0
.180.230.24-0.020.180.00-0.31-0.39-0.2911AGG0.000.230.000.260.250.35-0.170.420.000.190.100.3012AGT0.000.440.000.16-0.220.21-0.050.140.000.820.480.4113ATA0.00-0.240.00-0.03-0.12-0.23-0.38-0.060.00-0.30-0.45-0.2114ATC0.000.380.000.250.350.330.450.400.000.220.280.2015ATG0.00-0.190.000.120.140.08-0.070.050.00-0.090.000.1216ATT0.00-0.150.00-0.16-0.16-0.250.00-0.320.00-0.03-0.01-0.0917CAA0.00-0.270.00-0.21-0.18-0.28-0.39-0.040.00-0.35-0.19-0.2118CAC0.000.280.000.090.24-0.070.130.100.00-0.21-0.070.0719CAG0.000.400.000.270.250.130.210.240.000.390.240.4720CAT0.00-0.080.00-0.07-0.08-0.27-0.02-0.060.000.170.04-0.0821CCA0.00-0.300.00-0.18-0.28-0.17-0.06-0.390.00-0.36-0.38-0.1622CCC0.000.420.000.000.13-0.160.280.230.00-0.130.710.2223CCG0.000.320.000.330.210.300.440.070.000.370.290.4324CCT0.00-0.150.00-0.30-0.21-0.40-0.24-0.350.00-0.19-0.20-0.3325CGA0.00-0.160.00-0.200.250.03-0.010.250.00-0.050.15-0.0126CGC0.00-0.250.00-0.27-0.06-0.280.040.000.00-0.54-0.47-0.5627CGG0.000.340.000.240.240.51-0.060.430.000.560.590.4228CGT0.00-0.140.00-0.25-0.42-0.42-0.50-0.450.000.590.630.3529CTA0.00-0.370.00-0.37-0.28-0.37-0.47-0.200.00-0.50-0.53-0.5130CTC0.000.260.000.170.220.280.340.210.000.280.28-0.1631CTG0.000.110.000.110.06-0.05-0.01-0.020.000.290.180.6032CTT0.00-0.060.00-0.13-0.09-0.29-0.07-0.440.00-0.150.11-0.1133GAA0.00-0.160.000.05-0.070.08-0.220.280.00-0.150.00-0.0334GAC0.000.280.000.180.500.340.360.380.00-0.24-0.21-0.1135GAG0.000.130.000.050.280.500.150.290.000.180.000.2936GAT0.00-0.080.00-0.12-0.12-0.15-0.110.010.000.200.150.0537GCA0.00-0.040.000.150.00-0.23-0.08-0.010.000.01-0.110.0038GCC0.000.080.00-0.080.310.160.330.070.00-0.390.11-0.2939GCG0.000.290.000.260.390.410.530.290.000.400.350.3411040GCT0.00-0.250.00-0.27-0.41-0.45-0.25-0.480.000.18-0.24-0.1041GGA0.000.100.000.10-0.050,790.070.420.000.130.270.3142GGC0.00-0.200.00-0.170.220.110.180.170.00-0.39-0.46-0.4743GGG0.000.290.000.240.420.600.390.410.000.280.460.4744GGT0.000.250.00-0.10-0.51-0.11-0.39-0.410.000.760.700.6345GTA0.00-0.35
0.00-0.08-0.29-0.32-0.39-0.270.000.01-0.36-0.3346GTC0.000.480.000.210.410.360.440.440.00-0.040.290.0447GTG0.00-0.100.00-0.020.230.260.230.210.000.360.110.4248GTT0.000.070.00-0.12-0.14-0.40-0.01-0.370.00-0.120.03-0.2149TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.000.000.000.050.290.080.050.180.00-0.10-0.05-0.2051TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.00-0.020.00-0.02-0.09-0.14-0.010.090.000.070.030.1153TCA0.000.180.000.300.08-0.090.13-0.100.000.260.100.3254TCC0.00-0.100.00-0.29-0.01-0.130.14-0.170.00-0.400.20-0.3255TCG0.000.240.000.160.290.300.510.140.000.380.250.5556TCT0.00-0.080.00-0.11-0.30-0.27-0.12-0.400.000.120.10-0.1757TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.00-0.080.00-0.050.090.19-0.280.250.00-0.31-0.32-0.4159TGG0.00-0.100.000.060.150.32-0.03-0.010.000.060.00-0.0760TGT0.000.270.00-0.03-0.30-0.17-0.300.070.000.860.560.4561TTA0.000.030.00-0.15-0.20-0.20-0.21-0.050.00-0.34-0.29-0.1062TTC0.000.130.000.210.01-0,050.04-0.070.00-0.09-0.09-0.1963TTG0.000.220.000.170.170.120.27-0.010.000.070.220.4964TTT0.00-0.060.00-0.09-0.11-0.10-0.05-0.170.000.140.05-0.01TAATACTAGTATTCATCCTCGTCTTGATGCTGGTGT495051525354555657585960表C.3繼續(xù)61TTA62TTC63TTG64TTT1AAA0.34-0.270.310.262AAC0.22-0.270.24-0.063AAG0.63-0.100.72-0.124AAT-0.09-0.13-0.070.255ACA0.060.00-0.220.166ACC0.12-0.15-0.08-0.307ACG0.150.070.58-0.108ACT-0.110.13-0.030.199AGA0.480.440.290.4810AGC0.260.210.10-0.0111AGG0.460.290.680.2912AGT-0.270.38-0.260.1013ATA-0.45-0.39-0.48-0.1014ATC0.350.210.490.2815ATG0.74-0.020.760.0116ATT-0.47-0.32-0.450.0417CAA0.11-0.120.14-0.1718CAC0.090.05-0.110.1719CAG0.660.270.680.1620CAT-0.310.02-0.43-0.0921CCA-0.22-0.22-0.18-0.2822CCC0.380.420.100.2223CCG0.400.230.67-0.0411124CCT-0.270.150.010.0525CGA-0.24-0.34-0.22-0.2426CGC0.330.040.180.0527CGG0.410.410.720.0028CGT-0.36-0.40-0.37-0.4529CTA-0.49-0.30-0.57-0.2830CTC0.350.320.040.2331CTG0.190.100.49-0.1932CTT-0.050.25-0.330.4133GAA0.20-0.040.020.0334GAC0.120.200.070.1835GAG0.520.220.58-0.1136G
AT-0.33-0.10-0.40-0.1037GCA-0.050.18-0.20-0.0238GCC0.350.260.140.0539GCG0.210.340.56-0.0940GCT-0.36-0.08-0.32-0.1941GGA-0.010.10-0.140.0342GGC0.270.080.110.1343GGG0.190.430.40-0.0244GGT-0.40-0.13-0.46-0.3445GTA-0.24-0.36-0.36-0.2546GTC0.530.320.420.2947GTG0.140.170.45-0.1548GTT-0.250.00-0.240.0949TAA0.000.000.000.0050TAC-0.070.08-0.050.1351TAG0.000.000.000.0052TAT-0.43-0.21-0.500.0353TCA-0.130.02-0.200.0154TCC0.09-0.110.02-0.2455TCG-0.170.100.56-0.2856TCT-0.330.09-0.090.0957TGA0.000.000.000.0058TGC0.230.130.090.1059TGG0.700.010.83-0.0160TGT-0.310.17-0.33-0.2261TTA-0.17-0.09-0.20-0.1562TTC-0.22-0.260.15-0.1463TTG0.24-0.050.58-0.2464TTT-0.44-0.25-0.360.30TTATTCTTGTTT61626364112<table>tableseeoriginaldocumentpage113</column></row><table>49TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC-0.21-0.66-0.040.14-0.430.420.23-0.46-0.31-0.510.50-0.6851TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.030.370.280.460.360.54-0.300.310.761.001.000.7653TCA-0.24-0.490.290.220.12-0.23-0.11-0.66-0.36-0.320.24-0.5354TCC0.59-0.030.31-0.340.410,360.49-0.42-0.54-0.341.00-0.2855TCG0.200.56-0.300.210.360.730.840.46-0.071.00-0.15-0.0556TCT-0.15-0.23-0.070.16-0.210.460.52-0.250.791.001.000.3157TCA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC-0.17-0.180.300.24-0.051.00-0.280.33-0.44-0.381.000.0059TGG0.18-0.22-0.300.220.130.55-0.33-0.04-0.35-0.470.04-0.3160TGT-0.09-0.160.430.07-0.21-0.180.28-0.181.001.001.001.0061TTA-0.33-0.52-0.32-0.14-0.190.40-0.32-0.48-0.31-0.110.460.0462TTC-0.38-0.39-0.060.49-0.380.33G.07-0.19-0.55-0.62-0.46-0.5463TTG-0.23-0.12-0.180.000.270.230.09-0.440.330.750.310.4664TTT0.16-0.120.340.110.13-0.270.310.160.590.880.750.73AAAAACAAGAATACAACCACGACTAGAAGCAGGAGT123456789101112表C.4繼續(xù)131415161718192021222324ATAATCATGATTCAACACCAGCATCCACCCCCGCCT1AAA0.42-0.44-0.150.14-0.04-0.120.370.570.040.460.210.202AAC0.66-0.46-0.330.09-0.30-0,150.04-0.14-0.240.53-0.150.133AAG0.920.460.390.21-0.19-0.51-0.33-0.39-0.040.84-0.36-0.494AAT0.54-0.110.380.24-0.01-0.180.330.30
-0.08 -0.08 -0.08 0.36
5 ACA 0.70 -0.18 -0.12 0.23 -0.14 -0.28 0.06 0.24 -0.34 0.33 -0.11 0.04
6 ACC 0.08 0.08 0.38 -0.32 0.47 -0.33 0.45 0.63 0.51 0.66 0.93 0.68
7 ACG -0.42 -0.27 -0.26 -0.39 0.31 0.22 0.16 0.40 -0.51 0.59 0.00 -0.34
8 ACT 0.54 0.95 0.58 0.74 -0.47 -0.62 -0.34 -0.36 -0.61 0.66 0.58 0.04
9 AGA 0.41 -0.10 0.39 0.24 0.61 -0.29 0.20 0.15 0.48 1.00 -0.16 0.43
10 AGC 0.37 -0.19 0.01 -0.22 -0.16 -0.36 -0.18 0.36 0.14 0.70 0.07 0.24
11 AGG 0.35 0.57 0.61 0.60 0.70 0.57 0.38 0.16 0.53 1.00 0.18 0.38
12 AGT 0.17 0.54 0.23 0.12 0.73 0.43 0.44 0.30 0.37 0.34 0.32 -0.03
13 ATA 0.81 0.80 0.93 0.71 0.15 0.33 0.05 0.13 0.16 0.64 -0.08 0.11
14 ATC 0.15 -0.51 -0.36 -0.20 0.05 0.26 0.34 0.12 -0.20 0.64 0.07 0.44
15 ATG 0.55 -0.13 0.00 -0.03 0.19 -0.21 -0.17 0.13 -0.11 0.44 0.11 -0.18
16 ATT 0.66 -0.03 0.18 0.21 -0.22 -0.48 -0.04 0.19 -0.25 0.47 -0.19 -0.06
17 CAA 0.21 -0.49 -0.32 -0.14 0.59 0.90 0.82 0.61 0.56 0.81 0.89 0.83
18 CAC 1.00 -0.11 -0.14 -0.33 0.04 -0.57 0.09 -0.05 -0.21 1.00 0.04 -0.04
19 CAG 0.70 0.50 0.49 0.31 -0.26 -0.42 -0.53 -0.42 -0.50 0.04 -0.46 -0.49
20 CAT 0.56 -0.03 0.08 0.05 0.13 0.32 -0.16 0.18 -0.32 0.51 -0.07 0.18
21 CCA 0.75 -0.50 -0.23 0.08 -0.02 -0.37 0.79 0.90 0.12 1.00 1.00 0.13
22 CCC 1.00 1.00 0.63 1.00 0.35 -0.21 1.00 0.17 0.07 1.00 1.00 0.39
23 CCG 0.57 -0.38 -0.28 -0.43 -0.39 0.05 0.50 -0.12 -0.42 -0.17 -0.32 -0.47
24 CCT 1.00 0.56 0.60 0.64 -0.33 0.21 -0.15 -0.26 0.42 1.00 0.50 -0.13
25 CGA -0.20 -0.20 -0.24 -0.07 0.41 1.00 0.70 0.80 0.54 1.00 0.80 1.00
26 CGC 0.85 0.12 -0.24 -0.20 -0.23 -0.26 -0.43 -0.12 -0.33 0.55 -0.10 -0.11
27 CGG 0.04 0.10 -0.09 -0.27 0.37 0.49 0.26 0.00 0.16 1.00 0.64 0.26
28 CGT 0.67 -0.36 -0.20 -0.24 -0.62 -0.36 0.03 -0.29 -0.55 -0.03 -0.53 -0.63
29 CTA 1.00 0.18 -0.01 -0.01 0.42 1.00 1.00 0.56 0.67 1.00 0.86 1.00
30 CTC 0.46 -0.03 -0.13 -0.20 0.55 0.61 0.60 0.32 0.28 0.69 0.25 0.24
31 CTG 0.24 -0.40 -0.27 -0.22 -0.11 0.54 0.19 0.19 -0.30 0.86 -0.24 0.02
32 CTT 0.75 0.37 0.61 0.65 -0.47 -0.58 -0.22 -0.46 -0.36 0.42 -0.27 -0.60
33 GAA 0.35 -0.34 -0.14 0.10 0.27 0.63 0.13 0.37 0.35 0.91 0.38 0.46
34 GAC 0.53 -0.08 0.08 0.07 -0.29 -0.34 0.38 -0.16 -0.15 0.46 0.23 0.00
35 GAG 0.56 0.22 0.36 0.10 -0.19 -0.58 -0.38 -0.43 -0.28 -0.09 -0.53 -0.56
36 GAT 0.30 -0.07 -0.04 -0.08 0.13 0.23 -0.10 0.14 -0.05 0.49 -0.15 -0.06
37 GCA 0.59 -0.19 -0.22 0.34 -0.15 0.15 -0.25 0.35 -0.17 0.82 -0.07 -0.07
38 GCC 0.45 -0.08 0.23 -0.19 0.23 0.11 0.43 0.07 0.53 0.75 0.50 0.69
39 GCG 0.13 -0.35 -0.25 -0.39
0.02-0.370.51-0.03-0.270.42-0.28-0.2540GCT0.450.420.490.46-0.42-0.330.220.00-0.150.790.15-0.4741GGA0.82-0.030.240.15-0.20-0.54-0.050.170.080.41-0.160.0942GGC0.35-0.190.03-0.02-0.04-0.060.220.130.240.640.010.3943GGG0.540.390.25-0.130.10-0.330.010.05-0.060.25-0.41-0.4744GGT0.19-0.43-0.41-0.070.110.08-0.020.610.060.670.12-0.1245GTA0.620.150.350.740.14-0.19-0.140.29-O.Ol0.520.08-0.0446GTC0.57-0.37-0.34-0.370.520.610.650.620.830.460.740.4447GTG0.57-0.11-0.060.140.350.220.020.16-0.020.46-0.46-0.3248GTT0.35-0.360.280.25-0.44-0.51-0.37-0.40-0.240.180.03-0.3949TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC-0.12-0.43-0.260.32-0.12-0.060.240.04-0.250.370.290.0251TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.30-0.050.190.10-0.040.05-0.01-0.03-0.260.49-0.170.2153TCA0.760.06-0.210.25-0.21-0.010.130.32-0.200.120.16-0.2754TCC0.56-0.26-0.20-0.350.78-0.380.430.550.240.460.560.8355TCG1.00-0.190.37-0.380.02-0.310.280.280.040.32-0.40-0.4856TCT0.17-0.140.140.31-0.53-0.460.29-0.32-0.611.000.11-0.4057TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.49-0.160.090.63-0.22-0.50-0.37-0.19-0.230.320.16-0.0659TGG-0.230.050.000.040.46-0.29-0.320.200.341.000.00-0.3560TGT-0.18-0.59-0.100.400.690.390.360.700.610.17-0.01-0.3561TTA0.90-0.38-0.200.070.370.550.120.220.500.640.780.3462TTC0.59-0.23-0.160.02-0.51-0.58-0.37-0.41-0.550.27-0.36-0.6563TTG0.60-0.280.06-0.220.00-0.13-0.150.360.250.10-0.200.2364TTT0.230.020.09-0.050.310.490.470.410.270.660.370.68ATAATCATGATTCAACACCAGCATCCACCCCCGCCT131415161718192021222324表C.4繼續(xù)252627282930313233343536CGACGCCGGCGTCTACTCCTGCTTGAAGACGAGGAT1AAA-0.20-0.38-0.04-0.20-0.170.490.500.26-0.13-0.26-0.060.252AAC-0.22-0.550.04-0.70-0.610.17-0.31-0.460.13-0.100.390.183AAG0.30-0.030.07-0.50-0.39-0.31-0.66-0.680.25-0.120.38-0.054AAT0.330.530.73-0.200.330.290.250.03-0.19-0.27-0.020.12ACA1.000.040.42-0.020.360.370.17-0.08-0.200.050.050.206ACC0.190.430.750.130.340.560.45-0.140.570.370.330.21"7ACG0.520.440.560.23-0.36-0.17-0.23-0.490.270.550.680.258A
CT0.19-0.471.00-0.830.020.570.73-0.02-0.42-0.67-0.45-0.439AGA-0.34-0.090.600.190.780.700.350.72-0.07-0.270.030.4410AGC0.49-0.60-0.08-0.330.090.30-0.04-0.240.340.310.400.5011AGG1.000.73-0.310.690.39-0.08-0.440.240.530.320.34-0.2612AGT0.450.490.320.710.010.780.07-0.12-0.51-0.54-0.51-0.4413ATA-0.030.361.000.630.61-0.040.590.670.430.560.460.1514ATC0.07-0.630.36-0.620.070.360.48-0.41-0.10-0.010.310.1715ATG0.86-0.130.41-0.320.670.18-0.40-0.430.02-0.12-0.040.0816ATT0.150.010.74-0.390.22-0.140.52-0.02-0.14-0.20-0.04-0.0811517CAA0.340.54O.卯0.380.590.940.910.860.040.21-0.300.1818CAC0.07-0.54-0.56-0.710.180.820.15-0.38-0.110.230.260.0819CAG0.32-0.52-0.58-0.54-0.46-0.49-0.44-0.680.050.180.25-0.2920CAT0.770.360.440.150.400.650.500.460.14-0.21-0.230.0521CCA0.45-0.020.66-0.321.000.640.580.660.080.05-0.260.1322CCC1.00-0.771.000.381.000.230.820.640.460.530.320.8223CCG-0.300.340.27-0.51-0.43-0.39-0.20-0.620.250.330.510.2824CCT1.000.330.33-0.350.190.290.610.05-0.40-0.39-0.33-0.4525CGA1.000.731.000.700.410.740.390.630.540.180.790.4426CGC0.20-0.66-0.12-0.76-0.27-0.32-0.38-0.080.43-0.130.100.2227CGG1.000.510.360.810.270.36-0.33-0.220.500.300.340.3728CGT1.000.290.44-0.64-0.55-0.40-0.05-0.60-0.52-0.62-0.63-0.1929CTA1.001.001.001.00-0.260.100.860.860.030.33-0.160.0230CTC0.70-0.110.26-0.230.700.740.570.500.530.110.67-0.1231CTG0.72-0.110.48-0.270.44-0.27-0.22-0.430.490.450.220.4832CTT-0.22-0.720.13-0.60-0,11-0.300.01-0.50-0.53-0.40-0.43-0.5433GAA-0.02-0.030.22-0.030.240.330.330.50-0.11-0.27-0.200.0834GAC-0.42-0.74-0.37-0.670.230.260.310.020.050.290.240.1735GAG0.22-0.53-0.21-0.61-0.48-0.46-0.50-0.700.300.220.430.1536GAT0.590.430.650.570.200.100.23-0.10-0.11-0.310.060.0737GCA0.87-0.380.61-0.190.260.190.36-0.060.12-0.070.200.2038GCC0.830.110.47-0.550.670.630.450.590.390.630.430.3139GCG0.32-0.170.67-0.40-0.04-0.25-0.37-0.590.140.120.190.2640GCT0.410.320.82-0.54-0.350.380.54-0.16-0.43-0.56-0.31-0.2741GGA0.26-0.510.42-0.480.610.430.41-0.04-0.11-0.270.15-0.0342GGC0.33-0.300.05-0.57-0.160.420.34-0.12
0.340.120.200.3543GGG0.390.150.260.240.500.23-0.16-0.450.400.330.660.3644GGT0.820.580.89-0.25-0.35-0.320.63-0.65-0.49-0.59-0.44-0.0545GTA0.58-0.320.36-0.19-0.430.140.23-0.17-0.11-0.15-0.340.0646GTC0.68-0.460.81-0.460.560.350.660.540.560.510.600.3147GTG0.370.040.61-0.090.41-0.43-0.51-0.470.470.280.380.3548GTT0.28-0.400.47-0.540.06-0.200.58-0.27-0.47-0.43-0.30-0.3549TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC-0.17-0.63-0,25-0.70-0.330.200.19-0.230.160.420.29-0.2351TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.350.220.680.100.200.070.500.30-0.15-0.170.040.1453TCA1.000.200.85-0.040.34-0.140.180.000.120.14-0.020.2154TCC1.00-0.600.18-0.40-0.170.650.580.090.690.560.340.2655TCG0.430.740.65-0.17-0.35-0.11-0.37-0.500.600.890.490.1956TCT0.72-0.151.00-0.39-0.430.450.43-0.29-0.40-0.49-0.31-0.3457TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC-0.15-0.63-0.30-0.600.210.30-0.32-0.130.610.30-0.190.5259TGG0.69-0.010.240.170.62-0.41-0.30-0.160.05-0.14-0.090.0960TGT1.001.001.001.000.040.150.60-0.28-0.20-0.16-0.43-0.4261TTA0.650.670.680.340.650.470.640.500.070.350.200.4062TTC-0.22-0.60-0.29-0.77-0.300.11-0.27-0.440.24-0.150.640.3663TTG1.00-0.040.04-0.24-0.24-0.31-0.33-0.450.420.300.370.3464TTT0.640.660.850.610.460.400.580.41-0.27-0.270.230.09CGACGCCGGCGTCTACTCCTGCTTGAAGACGAGGAT252627282930313233343536表C.4繼續(xù)373839404142434445464748GCAGCCGCGGCTGGAGGCGGGGGTGTAGTCGTGGTT116<table>tableseeoriginaldocumentpage117</column></row><table>TCT-0.050.03-0.18-0.650.400.450.54-0.45-0.56-0.480.14-0.5657TGA0.000.000.000.000.000.000.000.000.000.000.000.00foJO/八oou丄or\,f-u.JJw乂,-u.jw-ar\,-0.15-0.280.470.470.5259TGG-0.180.13-0.100.260.05-0.340.500.41-0.180.54-0.270.0060TGT0.04-0.23-0.16-0.091.000.740.640.52-0.29-0.48-0.330.2261TTA0.010.550.190.510.300.260.29-0.07-0.080.180.610.1062TTC-0.050.010.160.04-0.38-0.49-0.39-0.65-0.170.040.57-0.3663TTG-0.360.490.29-0.050.620.530.750.710.270.620.450.2564TTT0.110.13-0.22-0.010.440.430.650.350.00-0.120.190.01GCAGCCGCGGCTGGAGGCGGGGG
TGTAGTCGTGGTT373839404142434445464748表C.4繼續(xù)495051525354555657585960TAATACTAGTATTCATCCTCGTCTTGATGCTGGTGTlAAA0.00-0.210.000.10-0.350.10-0.040.110.00-0.23-0.06-0.192AAC0.00-0.460.00-0.15-0.28-0.120.68-0.450.00-0.52-0.39-0.413AAG0.00-0.120.000.170.230.500.620.180.000.660.150.594AAT0.000.380.000.280.410.510.500.570.000.730.500.675ACA0.00-0.040.000.19-0.32-0.180.180.010.00-0.370.050.056ACC0.000.280.000.290.390.630.770.540.00-0.250.401.007ACG0.00-0.330.000.000.340.020.310.390.000.010.281.008ACT0.00-0.580.000.35-0.38-0.320.07-0.540.000.00-0.50-0.389AGA0.000.540.000.690.820.84-0.190.700.000.43-0.10-0.0310AGC0.00-0.390.00-0.17-0.050.490.52-0.030.000.27-0.32-0.3211AGG0.000.370.001.001.001.00-0.110.720.001.000.621.0012AGT0.000.840.000.39-0.181.00-0.53-0.280.001.000.710.3513ATA0.000.800.00-0.050.570.030.260.270.000.390.551.0014ATC0.00-0.050.000.330.280.220.250.020.000.590.40-0.1115ATG0.00-0.210.000.150.03-0.350.320.130.00-0.180.000.2616ATT0.00-0.470.000.17-0.10-0.230.27-0.440.00-0.34-0.30-0.1617CAA0.00-0.360.00-0.24-0.090.00-0.46-0.410.00-0.26-0.29-0.3918CAC0.00-0.010.00-0.200.070.041.000.100.000.45-0.12-0.0119CAG0.000.290.000.480.490.530.29-0.260.000.540.420.4420CAT0.000.000.000.12-0.270.170.70-0.100.00-0.180.070.0121CCA0.00-0.560.00-0.04-0.140.530.70-0.480.00-0.52-0.18-0.4022CCC0.000.660.001.000.180.50-0.210.690.001.000.571.0023CCG0.000.090.000.48-0.010.590.35-0.480.001.000.290.6424CCT0.00-0.530.00-0.120.060.530.41-0.430.00-0.45-0.32-0.5525CGA0.000.390.000.010.770.151.000.730.00-0.330.27-0.6326CGC0.00-0.420.00-0.27-0.301.001.00-0.460.00-0.68-0.61-0.5327CGG0.000.440.000.500.570.480.670.840.001.000.780.4428CGT0.00-0.590.00-0.53-0.670.09-0.51-0.800.001.000.801.0029CTA0.00-0.480.00-0.180.260.66-0.41-0.320.00-0.41-0.60-0.5130CTC0.000.140.000.310.021.000.810.440.000.250.181.0031CTG0.000.330.00-0.10-0.180.370.65-0.120.000.300.480.7932CTT0.00-0.590.00-0.16-0.21-0.50-0.19-0.660.00-0.520.230.3633GAA0.00-0.340.000.32-0.160.21-0.140.140.000.400.13-0.2534GAC0.00-0.200.000.180.310.310.2
70.140.00-0.51-0.380.1735GAG0.000.000.00-0.090.220.470.330.450.00-0.41-0.220.4936GAT0.00-0.280.000.19-0.06-0.190.590.230.000.380.350.1637GCA0.00-0.090.000.140.04-0.270.080.090.00-0.41-0.180.2138GCC0.000.530.000.280.570.710.250.690.00-0.050.280.3639GCG0.00-0.010.000.340.280.540.610.180.000.310.280.5011840GCT0.00-0.500.00-0.29-0.26-0.53-0.36-0.640.000.41-0.23-0.4941GGA0.00-0.100.000.120.040.720.380.210.000.480.340.6242GGC0.00-0.160.00-0.100.470.170.590.080.00-0.42-0.47-0.5943GGG0.000.260.000.350.630.870.830.490.000.400.580.2744GGT0.000.340.00-0.24-0.650.17-0.17-0.730.001.000.530.3645GTA0.00-0.640.00-0.17-0.38-0.18-0.07-0.550.00-0.32-0.11-0.3746GTC0.000.590.000.530.610.790.280.600.00-0.190.230.1047GTG0.00-0.080.000.410.180.500.460.290.000.510.150.7048GTT0.00-0.330.00-0.03-0.12-0.230.26-0.530.000.21-0.21-0.3349TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.00-0.130.000.210.41-0.070.09-0.440.00-0.17-0.09-0.4951TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.000.040.00-0.08-0.06-0.13-0.180.160.000.570.05-0.04530.00.0.310.000.5了0.320.2g0.38-0.190.00-0.29-0.030.4354TCC0.00-0.370.000.290.260.33-0.120.020.000.141.000.4855TCG0.000.500.000.640.690.720.640.290.000.450.401.0056TCT0.00-0.420.00-0.19-0.38-0.350.47-0.540.00-0.38-0.38-0.4957TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.000.130.00-0.200.11-0.18-0.03-0.010.00-0.551.00-0.2659TGG0.000.200.00-0.090.890.810.26-0.310.00-0.340.000.6360TGT0.000.650.00-0.12-0.380.02-0.200.080.001.00-0.551.0061TTA0.000.340.000.33-0.29-0.270.23-0.310.00-0.540.051.0062TTC0.00-0.570.00-0.13-0.460.07-0.24-0.570.00-0.18-0.52-0.3363TTG0.000.160.000.600.340.780.580.110.00-0.27-0.450.3364TTT0.000.250.000.270.250.350.590.110.000.230.510.07TAATACTAGTATTCATCCTCGTCTTGATGCTGGTGT495051525354555657585960表C.4繼續(xù)61TTA62TTC63TTG64TTT1AAA0.32-0.440.420.382AAC0.12-0.580.11-0.103AAG0.70-0.020.66-0.034AAT0.32-0.140.520.665ACA0.01-0.33-0.320.046ACC0.07-0.040.360.347ACG0.30-0.340.43-0.068ACT-0.340.300.040.569AGA0.430.310.270.6810AGC
0.42-0.290.33-0.1311AGG0.530.531.000.6712AGT-0.120.360.190.3913ATA-0.10-0.41-0.350.4214ATC0.05-0.360.270.3615ATG0.80-0.410.860.3316ATT-0.41-0.49-0.300.3517CAA0.16-0.330.10-0.0518CAC-0.40-0.43-0.250.1819CAG0.670.060.630.2720CAT-0.37-0.18-0.430.1921CCA-0.04-0.60-0.42-0.3022CCC0.781.000.440.6123CCG0.32-0.080.710.0111924CCT-0.090.220.040.3525CGA0.55-0.371.00-0.1526CGC-0.05-0.48-0.30-0.0927CGG0.630.860.880.1528CGT-0.48-0.71-0.51-0.1229CTA-0.47-0.69-0.49-0.1530CTC0.54-0.18-0.140.5331CTG0.39-0.310.50-0.0432CTT-0.220.02-0.190.6633GAA0.37-0.330.260.1634GAC0.08-0.380.000.3435GAG0.460.140.670.0936GAT-0.09-0.26-0.410.1437GCA-0.09-0.300.040.1338GCC0.530.15-0.070.2039GCG0.10-0.050.70-0.1640GCT-0.40-0.15-0.120.1641GGA0.17-0.23-0.010.3542GGC0.18-0.460.110.2543GGG0.870.530.590.5244GGT-0.37-0.50-0.52-0.2345GTA-0.37-0.69-0.34-0.1446GTC0.740.240.430.2947GTG0.360.300.620.0148GTT-0.140.05-0.150.3049TAA0.000.000.000.0050TAC0.08-0.280.03-0.1151TAG0.000.000.000.0052TAT-0.22-0.25-0.500.3253TCA-0.430.070.21-0.1254TCC0.390.09-0.300.5155TCG0.350.170.50-0.2056TCT-0.29-0.490.020.3957TGA0.000.000.000.0058TGC-0.17-0.470.490.1159TGG0.51-0.180.630.1160TGT0.510.23-0.360.2861TTA0.34-0.220.490.1662TTC-0.41-0.67-0.23-0.1863TTG0.33-0.490.71-0.1364TTT-0.29-0.05-0.050.58TTATTCTTGTTT61626364<table>tableseeoriginaldocumentpage121</column></row><table><table>tableseeoriginaldocumentpage122</column></row><table>33GAA-0.25-0.040.200.180.070.250.330.160.36-0.030.090.0234GAC0.630.190.290.370.45-0.07-0.080.040.380.35-0.090.4235GAG-0.32-0.22-0.310.06-0.43-0.33-0.29-0.29-0.420.65-0.24-0.1836GAT0.33-0.26-0.15-0.10-0.03-0.19-0.070.190.10-0.23-0.07-0.1137GCA-0.330.24-0.130.18-0.320.220.140.14-0.400.07-0.35-0.2938GCC0.29-0.020.36-0.190.660.170.270.180.540.040.600.3339GCG-0.270.36-0.31-0.13-0.42-0.13-0.17-0.18-0.430.31-0.130.0340GCT0.24-0.270.590.000.57-0.220.090.020.38-0.050.190.2841GGA-0.610.61-0.370.02-0.35-0.010.13-0.29-0.330.13-0.200.1042GGC0.340.270.190.010.820.540.530.530.290.240.010.4443GGG-0.47-0.28-0.43-0.28-0.7
30.10-0.45-0.50-0.670.72-0.52-0.0244GGT0.43-0.010.30-0.090.40-0.32-0.24-0.190.390.640.100.3345GTA-0.040.52-0.110.41-0.340.07-0.33-0.10-0.15-0.09-0.36-0.1246GTC0.260.020.30-0.130.860.420.450.160.760.180.620.3147GTG-0.05-0.01-0.35-0.20-0.130.36-0.20-0.18-0.190.28-0.21-0.0148GTT0.47-0.220.590.130.59-0.31-0.13-0.070.42-0.270.03-0.3049TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.44-0.200.130.170.650.020.170.240.350.100.150.3051TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.03-0.19-0.090.17-0.03-0.37-0.270.250.18-0.35-0.19-0.0353TCA-0.540.28-0.330.17-0.390.130.260.19-0.45-0.38-0.33-0.2854TCC0.08-0.280.22-0.100.730.220.390.410.35-0.160.42-0.0655TCG-0.380.29-0.43-0.17-0.490.110.01-0.01-0.62-0.34-0.31-0.2356TCT-0.08-0.210.41-0.120.47-0.320.330.120.14-0.140.07-0.1457TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.130.270.190.100.48-0.19-0.28-0.13-0.020.43-0.350.0959TGG-0.490.010.000.12-0.090.080.05-0.060.100.26-0.06-0.0960TGT-0.22-0.10-0.19-0.240.710.32-0.160.170.520.620.090.3861TTA-0.27-0.06-0.25-0.20-0.310.11-0.240.030.09-0.57-0.21-0.3662TTC0.680.300.570.530.84-0.35-0.50-0.080.26-0.27-0.45-0.2663TTG-0.39-0.05-0.33-0.40-0.45-0.17-0.54-0.27-0.37-0.45-0.27-0.4064TTT0.18-0.29-0.30-0.250.520.320.150.130.540.020.450.12ATAATCATGATTCAACACCAGCATCCAcccCCGCCT131415161718192021222324表C.5繼續(xù)252627282930313233343536CGACGCCGGCGTCTACTCCTGCTTGAAGACGAGGAT1AAA0.270.060.160.210.20-0.060.050.200.14-0.10-0.46-0.142AAC0.10-0.170.02-0.160.66-0.15-0.090.000.03-0.140.180.023AAG-0.56-0.25-0.50-0.27-0.51-0.23-0.38-0.060.610.560.410.424AAT0.100.120.120.340.04-0.400.07-0.30-0.13-0.130.020.195ACA-0.180.210.020.38-0.47-0.150.25-0.23-0.190.26-0.100.206ACC-0.10-0.120.190.080.530.380.540.460.01-0.240.16-0.157ACG-0.46-0.03-0.30-0.21-0.72-0.42-0.53-0.470.160.540.490.458ACT0.49-0.060.810.170.350.140.420.31-0.40-0.490.070.019AGA-0.540.30-0.310.31-0.130.300.43-0.280.050.410.070.2810AGC0.14-0.09-0.09-0.120.660.160.060.32-0.25-0.16-0.12-0.2611AGG-0.630.20-0.300.21-0.70-0.28-0.18
-0.62 0.01 0.44 -0.19 -0.12
12 AGT 0.28 0.24 -0.15 0.32 0.62 0.30 0.19 0.18 -0.39 0.01 -0.20 -0.06
13 ATA -0.23 0.31 0.10 0.20 -0.31 -0.21 0.41 -0.33 0.06 0.47 0.11 0.33
14 ATC 0.69 -0.21 0.55 0.06 0.84 -0.59 0.31 -0.05 0.08 -0.42 0.30 0.15
15 ATG 0.21 -0.08 0.31 -0.09 0.05 -0.17 -0.02 0.19 -0.03 0.02 0.07 -0.01
16 ATT 0.50 -0.19 0.54 -0.08 0.62 -0.44 0.07 -0.30 -0.12 0.25 -0.11 -0.01
17 CAA -0.38 0.10 -0.31 -0.09 -0.16 -0.30 -0.21 -0.39 0.44 0.59 0.11 0.56
18 CAC 0.10 0.00 0.02 -0.04 0.80 0.29 0.19 0.14 0.13 -0.28 0.14 -0.16
19 CAG -0.43 0.29 -0.20 -0.09 -0.31 0.16 0.47 -0.32 -0.15 -0.17 -0.15 -0.26
20 CAT 0.24 -0.15 -0.19 0.13 -0.06 -0.26 -0.10 -0.42 -0.12 -0.08 -0.02 0.37
21 CCA -0.52 -0.20 -0.38 0.20 -0.36 0.21 0.18 0.18 -0.14 -0.06 -0.10 0.28
22 CCC -0.22 0.25 -0.16 0.31 0.57 0.61 0.74 0.42 0.50 -0.05 0.54 -0.31
23 CCG -0.11 -0.17 0.55 -0.20 -0.52 -0.06 -0.30 0.10 -0.13 0.16 0.21 0.11
24 CCT 0.50 0.39 0.56 0.35 0.36 0.54 0.61 0.30 -0.15 -0.39 0.10 -0.18
25 CGA -0.34 -0.05 -0.38 -0.04 -0.53 -0.27 -0.39 -0.30 0.26 0.61 0.27 0.45
26 CGC 0.37 0.02 0.27 0.01 0.81 0.51 0.12 0.45 0.13 -0.21 0.01 -0.22
27 CGG -0.56 0.21 -0.64 -0.06 -0.56 0.34 -0.25 -0.25 -0.26 0.64 0.08 0.26
28 CGT 0.30 -0.02 -0.10 0.01 0.56 0.25 -0.20 0.01 -0.13 -0.10 0.04 0.17
29 CTA -0.54 -0.52 -0.54 -0.37 -0.46 -0.41 -0.16 -0.41 0.79 0.92 0.77 0.84
30 CTC 0.68 0.77 0.33 0.79 0.93 0.66 0.81 0.62 0.00 -0.68 0.47 -0.50
31 CTG 0.20 -0.06 0.28 -0.17 0.04 0.15 0.02 0.10 -0.30 0.24 -0.20 -0.12
32 CTT 0.69 0.52 0.58 0.56 0.50 0.22 0.69 0.25 -0.17 -0.01 -0.53 -0.49
33 GAA 0.30 0.24 0.11 0.19 0.29 0.30 -0.01 0.29 0.07 -0.04 -0.49 -0.27
34 GAC -0.01 -0.13 -0.09 -0.18 0.66 0.05 0.14 0.17 0.08 -0.04 0.15 -0.08
35 GAG -0.45 -0.19 -0.51 -0.34 -0.49 0.67 -0.31 -0.21 0.59 0.61 0.47 0.53
36 GAT 0.54 -0.05 0.23 0.25 0.28 -0.49 0.15 -0.15 -0.05 -0.14 -0.07 0.16
37 GCA -0.33 0.27 -0.08 0.27 -0.33 0.31 0.01 0.17 -0.26 0.26 -0.22 0.19
38 GCC 0.06 -0.02 0.02 0.04 0.73 0.52 0.52 0.52 0.17 -0.24 0.14 -0.30
39 GCG -0.44 0.15 -0.37 -0.20 -0.58 0.08 -0.42 -0.11 0.21 0.57 0.40 0.33
40 GCT 0.55 0.18 0.66 0.21 0.45 0.51 0.47 0.45 -0.39 -0.47 0.05 -0.27
41 GGA -0.63 -0.34 -0.54 -0.35 -0.29 0.42 0.04 -0.26 0.19 0.51 0.09 0.45
42 GGC 0.61 0.56 0.61 0.55 0.73 0.54 0.25 0.42 -0.07 -0.20 -0.13 -0.35
43 GGG -0.71 -0.36 -0.59 -0.46 -0.69 0.78 -0.51 -0.52 0.36 0.76 0.31 0.52
44 GGT 0.45 0.01 0.25 -0.17 0.64 0.84 -0.04 0.31 -0.13 0.10 0.05 0.10
45 GTA -0.08 -0.20 -0.04 -0.34 -0.04 0.00 -0.32 -0.25 -0.03 0.40
0.000.3046GTC0.670.450.430.420.890.550.700.580.130.23-0.01-0.4547GTG0.06-0.12-0.27-0.25-0.20-0.03-0.40-0.040.250.550.230.2348GTT0.420.130.580.120.560.170.400.28-0.25-0.11-0.37-0.3349TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.31-0.02-0.060.100.760.070.220.310.06-0.330.180.0251TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.10-0.260.010.150.15-0.53-0.13-0.33-0.10-0.310.000.4753TCA-0.15-0.08-0.290.30-0.57-0.25-0.11-0.280.010.410.280.4054TCC-0.15-0.230.040.010.660.380.510.410.48-0.120.380.0055TCG-0.28-0.19-0.42-0.07-0.69-0.33-0.53-0.440.420.670.680.6256TCT0.680.100.750.240.260.220.20-0.09-0.17-0.500.420.0057TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.02-0.13-0.24-0.140.73-0.07-0.240.040.06-0.210.240.0059TGG-0.500.22-0.400.00-0.470.57-0.110.22-0.080.260.19-0.1360TGT-0.100.42-0.540.350.490.38-0.070.05-0.180.000.020.2061TTA0.12-0.29-0.12-0.200.00-0.28-0.04-0.210.660.740.620.5662TTC0.64-0.420.23-0.330.83-0.66-0.12-0.290.26-0.070.300.1563TTG-0.28-0.21-0.28-0.28-0.54-0.44-0.49-0.430.840.900.830.6964TTT0.640.220.540.340.63-0.290.51-0.14-0.19-0.06-0.12-0.04CGACGCCGGCGTCTACTCCTGCTTGAAGACGAGGAT252627282930313233343536表C.5繼續(xù)373839404142434445464748GCAGCCGCGGCTGGAGGCGGGGGTGTAGTCGTGGTT1AAA0.00-0.15-0.210.310.11-0.16-0.28-0.010.29-0.08-0.220.182AAC0.41-0.360.270.140.19-0.210.36-0.200.12-0.360.07-0.103AAG0.280.520.180.25-0.050.410.170.68-0.040.190.100.114AAT0.20-0.340.13-0.050.35-0.040.560.100.35-0.220.290.135ACA-0.240.280.150.06-0.430.26-0.130.10-0.150.370.250.336ACC0.02-0.190.30-0.11-0.40-0.04-0.32-0.20-0.03-0.010.19-0.227ACG-0.320.70-0.190.220.200.480.240.43-0.450.17-0.270.068ACT-0.06-0.25-0.08-0.260.03-0.300.670.110.04-0.030.470.049AGA-0.300.45-0.020.09-0.540.27-0.140.33-0.21-0.020.140.0710AGC0.17-0.390.340.55-0.15-0.140.10-0.16-0.12-0.16-0.010.0111AGG-0.490.510.04-0.23-0.260.510.410.600.170.55-0.10-0.0612AGT0.12-0.140.030.010.040.200.290.090.450.400.370.3713ATA0.210.000.400.330.260.390.050.310.400.030.430.1914ATC0.45-0.430.45-0.190.64-0.
130.44-0.130.53-0.380.34-0.1315ATG-0.050.52-0.290.12-0.14-0.10-0.360.430.290.08-0.210.1416ATT0.46-0.410.30-0.260.68-0.220.59-0.240.59-0.390.14-0.2117CAA0.610.430.540.54-0.040.34-0.270.270.600.610.580.5518CAC0.21-0.340.14-0.210.16-0.350.05-0.41-0.24-0.330.87-0.3619CAG-0.320.21-0.36-0.19-0.31-0.08-0.450.36-0.31-0.01-0.27-0.2820CAT0.35-0.110.10-0.040.320.290.520.280.25-0.150.04-0.1021CCA0.010.240.330.19-0.41-0.03-0.090.15-0.070.550.260.4422CCC-0.37-0.460.06-0.40-0.47-0.410.34-0.350.05-0.200.30-0.2923CCG-0.250.760.090.34-0.040.330.010.35-0.420.31-0.14-0.0624CCT-0.29-0.45-0.38-0.46-0.22-0.440.05-0.08-0.06-0.080.40-0.2225CGA0.090.360.110.11-0.110.570.170.54-0.190.24-0.070.4026CGC-0.04-0.280.22-0.07-0.06-0.210.01-0.14-0.27-0.12-0.15-0.1327CGG-0.520.23-0.49-0.21-0.450.520.160.55-0.66-0.19-0.55-0.2128CGT0.250.240.140.00-0.03-0.010.17-0.070.360.430.480.3129CTA0.760.730.750.780.730.710.590.740.830.720.810.7230CTC0.43-0.340.420.11-0.20-0.59-0.34-0.540.710.390.750.4731CTG-0.430.48-0.46-0.120.010.23-0.270.29-0.29-0.12-0.49-0.1532CTT0.29-0.330.33-0.210.17-0.39-0.23-0.080.640.310.730.3733GAA0.02-0.13-0.250.09-0.07-0.13-0.42-0.160.21-0.07-0.330.0234GAC0.29-0.160.24-0.020.15-0.230.02-0.280.16-0.060.18-0.1635GAG0.320.510.120.420.450.590.210.670.440.440.250.4036GAT0.26-0.250.01-0.130.34-0.040.590.100.29-0.19-0.01-0.0237GCA-0.130.410.180.10-0.25-0.03-0.21-0.12-0.160.420.250.3838GCC-0.17-0.150.28-0.18-0.680.87-0.56-0.090.000.050.31-0.1239GCG-0.330.70-0.190.33-0.060.260.050.19-0.500.28-0.320.0140GCT-0.20-0.31-0.13-0.300.01-0.150.21-0.02-0.230.080.43-0.1141GGA0.110.380.260.44-0.010.650.540.710.410.630.630.6242GGC-0.460.96-0.28-0.11-0.21-0.38-0.15-0.37-0.42-0.14-0.33-0.2743GGG-0.120.44-0.240.230.300.730.670.84-0.030.600.310.3244GGT-0.050.04-0.020.010.210.170.21-0.070.130.240.200.1245GTA0.240.460.140.410.260.42-0.070.200.280.390.070.4646GTC0.27-0.350.13-0.26-0.04-0.31-0.43-0.440.460.120.300.0447GTG-0.030.66-0.340.200.320.46-0.010.36-0.13-0.06-0.48-0.1648GTT0.21-0.370.23-0.370.44-0.260.06-0.1
70.450.080.49-0.0449TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.24-0.330.25-0.080.34-0.310.24-0.290.31-0.120.34-0.1151TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.24-0.330.180.160.370.020.600.150.29-0.350.11-0.1753TCA-0.170.150.190.35-0.280.410.210.30-0.210.200.270.3554TCC-0.16-0.070.38-0.20-0.09-0.39-0.36-0.47-0.20-0.100.30-0.1155TCG-0.270.64-0.140.220.450.740.470.61-0.520.02-0.420.03125<table>tableseeoriginaldocumentpage126</column></row><table>40GCT0.00-0.560.00-0.56-0.36-0.42-0.27-0.480.000.080.94-0.2241GGA0.000.310.00-0.150.130.72-0.490.470.00-0.070.21-0.1242GGC0.000.020.000.040.410.330.380.450.00-0.20-0.44-0.2243GGG0.000.700.000.310.520.840.420.680.000.790.780.4644GGT0.00-0.120.00-0.26-0.34-0.30-0.35-0.420.000.150.52-0.0445GTA0.000.440.000.440.060.17-0.140.180.000.38-0.02-0.0446GTC0.00-0.080.00-0.220.31-0.130.23-0.160.00-0.35-0.61-0.2947GTG0.000.670.000.400.590.790.270.610.000.390.370.4748GTT0.00-0.390.00-0.49-0.28-0.54-0.23-0.580.00-0.260.79-0.2249TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.00-0.300.000.320.29-0.270.230.040.00-0.33-0.510.1751TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.00-0.300.000.33-0.10-0.280.340.310.000.030.780.3153TCA0.000.590.000.54-0.550.07-0.40-0.170.000.030.000.4054TCC0.000.110.000.460.04-0.150.27-0.100.00-0.53-0.45-0.5055TCG0.000.740.000.58-0.050.490.020.380.000.480.400.6256TCT0.00-0.270.00-0.22-0.51-0.54-0.26-0.530.000.200.960.0357TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.00-0.140.000.09-0.030.130.19-0.040.00-0.15-0.33-0.1859TGG0.000.100.00-0.07-0.380.49-0.350.400.000.100.00-0.1160TGT0.000.110.00-0.04-0.50-0.31-0.30-0.260.000.370.610.0961TTA0.000.620.000.31-0.060.04-0.260.130.00-0.08-0.11-0.1362TTC0.00-0.130.000.120.02-0.40-0.18-0.240.00-0.25-0.52-0.0863TTG0.000.380.000.01-0.170.55-0.350.340.000.03-0.21-0.1764TTT0.000.100.00-0.070.08-0.030.350.080.000.100.800.25TAATACTAGTATTCATCCTCGTCTTGATGCTGGTGT495051525354555657585960表C.5繼續(xù)61TTA62TTC63TTG64TTT1AAA0.15-0.03
0.40 0.23
2 AAC 0.46 -0.30 0.58 0.19
3 AAG -0.02 -0.22 -0.02 -0.31
4 AAT -0.23 -0.04 0.06 0.20
5 ACA -0.50 0.16 -0.40 -0.13
6 ACC 0.43 -0.19 0.53 0.01
7 ACG -0.24 0.41 -0.34 0.27
8 ACT 0.24 -0.34 0.32 -0.09
9 AGA -0.34 0.66 -0.01 0.30
10 AGC 0.65 0.04 0.67 0.27
11 AGG -0.24 0.61 -0.44 0.37
12 AGT 0.06 -0.13 -0.27 -0.32
13 ATA -0.57 -0.25 -0.57 -0.49
14 ATC 0.66 -0.16 0.69 0.37
15 ATG 0.36 -0.15 -0.21 0.13
16 ATT -0.03 -0.16 -0.35 0.12
17 CAA -0.11 0.39 0.05 0.05
18 CAC 0.53 -0.16 0.68 0.08
19 CAG -0.07 -0.16 -0.25 -0.04
20 CAT -0.24 0.03 -0.21 0.03
21 CCA -0.32 0.19 -0.26 0.12
22 CCC 0.32 0.02 0.40 0.04
23 CCG -0.03 -0.09 -0.12 0.00
24 CCT -0.07 -0.07 0.31 -0.05
25 CGA -0.47 0.12 -0.44 -0.09
26 CGC 0.38 0.10 0.48 -0.08
27 CGG -0.36 0.78 -0.44 0.38
28 CGT 0.02 -0.05 -0.13 -0.24
29 CTA -0.46 0.18 -0.47 -0.33
30 CTC 0.77 0.26 0.85 0.15
31 CTG 0.22 0.09 0.07 -0.08
32 CTT 0.12 0.15 0.30 0.03
33 GAA 0.24 0.51 0.39 0.16
34 GAC 0.39 -0.05 0.59 0.33
35 GAG -0.25 -0.38 -0.35 -0.42
36 GAT -0.39 -0.18 -0.18 -0.01
37 GCA -0.41 0.06 -0.35 -0.19
38 GCC 0.47 0.16 0.57 -0.03
39 GCG -0.27 0.29 -0.29 -0.07
40 GCT 0.21 -0.04 0.27 -0.17
41 GGA -0.66 -0.18 -0.45 -0.37
42 GGC 0.32 0.08 0.41 0.03
43 GGG -0.49 0.57 -0.40 0.34
44 GGT -0.19 0.12 -0.18 -0.23
45 GTA -0.48 -0.19 -0.53 -0.27
46 GTC 0.62 -0.10 0.71 -0.15
47 GTG 0.12 0.16 -0.28 0.14
48 GTT 0.16 0.08 0.25 0.11
49 TAA 0.00 0.00 0.00 0.00
50 TAC 0.66 -0.28 0.72 0.28
51 TAG 0.00 0.00 0.00 0.00
52 TAT -0.10 -0.15 -0.10 0.14
53 TCA -0.53 0.09 -0.33 -0.18
54 TCC 0.48 0.00 0.58 0.23
55 TCG -0.23 0.12 -0.39 -0.09
56 TCT 0.14 -0.09 0.01 0.02
57 TGA 0.00 0.00 0.00 0.00
58 TGC 0.41 -0.03 0.55 0.10
59 TGG 0.09 0.22 0.01 -0.14
60 TGT -0.15 0.13 0.07 -0.16
61 TTA -0.42 0.19 -0.43 0.00
62 TTC 0.52 -0.39 0.50 -0.15
63 TTG -0.31 -0.18 -0.54 -0.15
64 TTT -0.34 0.14 -0.23 0.39
TTA TTC TTG TTT
61 62 63 64
Table C.6: CPW matrix for Escherichia coli K12 highly expressed sequences (the left codon is indicated in column 2, the right codon in row 2). Host cell sequence data.
[Table: see original document, page 129]
49 TAA 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50 TAC -0.47 -0.71 -0.29 0.84 1.00 -0.68 1.00 -0.73 1.00 -0.64 1.00 1.00
51 TAG 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
52 TAT 0.52 0.50 0.82 1.00 1.00 0.47 1.00 0.80 1.00 0.72 1.00 1.00
53 TCA 1.00 0.46 1.00 1.00 1.00 1.00 1.00 -0.24 1.00 1.00 1.00 1.00
54 TCC -0.63 -0.73 -0.57 0.73 0.31 -0.78 1.00 -0.69 1.00 -0.77 1.00 1.00
55 TCG 0.51 1.00 0.68 1.00 1.00 1.00 1.00 0.49 1.00 1.00 1.00
1.00
56 TCT -0.56 -0.76  0.31  1.00  0.29 -0.77  1.00 -0.83  1.00  1.00  1.00  1.00
57 TGA  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
58 TGC -0.22 -0.28  1.00  1.00  1.00 -0.48  1.00  0.27  1.00 -0.23  1.00  0.20
59 TGG -0.23 -0.37  1.00  0.71  1.00 -0.20  0.32 -0.39  1.00  0.28  1.00  1.00
60 TGT -0.25 -0.57  1.00  1.00 -0.15  0.66  1.00 -0.78  1.00  1.00  1.00  1.00
61 TTA  0.94  1.00  1.00  1.00  1.00  1.00  1.00  0.64  1.00  0.78  1.00  1.00
62 TTC -0.49 -0.69  0.14  0.65  0.51 -0.61  1.00 -0.77  1.00 -0.30  1.00  1.00
63 TTG  0.81  0.83  1.00  1.00  0.53  1.00  1.00  0.63  1.00  0.56  1.00  1.00
64 TTT  0.54  0.47  0.49  0.94  0.82  0.73  1.00  0.00  1.00  0.76  1.00  1.00
         AAA   AAC   AAG   AAT   ACA   ACC   ACG   ACT   AGA   AGC   AGG   AGT
           1     2     3     4     5     6     7     8     9    10    11    12

Table C.6 (continued)
          13    14    15    16    17    18    19    20    21    22    23    24
         ATA   ATC   ATG   ATT   CAA   CAC   CAG   CAT   CCA   CCC   CCG   CCT
 1 AAA  1.00 -0.36 -0.04  0.44  0.20 -0.51  0.16  1.00 -0.17  1.00  0.05  0.38
 2 AAC  1.00 -0.72 -0.41  0.69  0.54 -0.70 -0.56  0.47  0.24  1.00 -0.62  0.82
 3 AAG  1.00 -0.49  0.15  0.32  0.24 -0.64 -0.50  0.54  0.71  1.00 -0.61 -0.03
 4 AAT  1.00  0.62  0.87  0.94  1.00  0.71  0.70  1.00  0.81  1.00  0.46  1.00
 5 ACA  1.00  0.83  1.00  1.00  1.00  1.00  0.76  1.00  1.00  1.00  0.81  1.00
 6 ACC  1.00 -0.67 -0.47  0.30  0.19 -0.47  0.01  0.61  0.37  1.00  0.05  0.81
 7 ACG  1.00  1.00  0.73  0.80  0.56  0.44  0.88  1.00  0.49  1.00  0.36  0.69
 8 ACT  1.00 -0.64  0.35  0.35 -0.44 -0.85 -0.73  1.00  0.58  1.00 -0.82 -0.60
 9 AGA  1.00  1.00  1.00  1.00  1.00  1.00  1.00 -0.33  1.00  1.00  1.00  1.00
10 AGC  1.00 -0.37  0.38  0.63  1.00 -0.51 -0.17  0.78  1.00  1.00 -0.29  1.00
11 AGG  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
12 AGT  1.00  0.84  1.00  0.86  0.53  1.00  1.00  1.00  1.00  1.00  0.48  1.00
13 ATA  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
14 ATC  1.00 -0.70 -0.53  0.04  0.45 -0.68 -0.42  0.50 -0.18  0.73 -0.57 -0.31
15 ATG  1.00 -0.48  0.00  0.64  0.33 -0.43 -0.15  0.56 -0.24  1.00 -0.17  0.30
16 ATT  1.00  0.23  0.82  0.88  1.00 -0.09 -0.24  0.72  0.86  1.00  0.02  1.00
17 CAA  1.00  0.32  0.66  0.91  0.82 -0.11  0.81  0.58  1.00  1.00 -0.04  0.00
18 CAC  1.00 -0.73 -0.38  0.27  1.00 -0.72 -0.52  0.55 -0.33  1.00 -0.64  1.00
19 CAG  1.00 -0.58 -0.26  0.31  0.52 -0.58 -0.48  0.78  0.78  1.00 -0.49  0.48
20 CAT  1.00  0.53  0.46  0.76  1.00  0.09 -0.10  1.00  1.00  1.00  0.40  0.33
21 CCA  1.00 -0.61  0.57  0.84  0.37  0.59  0.34  0.69  1.00  1.00 -0.23  1.00
22 CCC  1.00  0.70  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
23 CCG  1.00 -0.58 -0.37  0.24  0.33 -0.70 -0.53  0.56 -0.42  1.00 -0.50 -0.36
24 CCT  1.00  0.53  0.49  1.00  1.00  1.00  0.80  0.63  1.00  1.00  1.00  1.00
25 CGA  1.00  0.58  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
26 CGC  1.00  0.54  0.10  0.35  1.00  0.30  0.44  0.87  0.78  1.00  0.36  0.20
27 CGG  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
28 CGT  1.00 -0.74 -0.40  0.14  0.55 -0.73 -0.69  0.02  0.29  1.00 -0.66 -0.61
29 CTA  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
30 CTC  1.00  0.14  0.81  0.64  0.63  0.47  0.61  1.00  1.00  1.00  0.82  1.00
31 CTG  1.00 -0.71 -0.43  0.25 -0.28 -0.75 -0.54  0.67 -0.48  0.84 -0.60 -0.02
32 CTT  1.00  0.78  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  0.63  1.00
33 GAA  1.00 -0.56 -0.11  0.66  0.49 -0.28  0.12  0.67 -0.22  1.00 -0.06 -0.02
34 GAC  1.00 -0.66 -0.35  0.38  0.87 -0.66 -0.57  0.34  0.51  1.00 -0.40  1.00
35 GAG  1.00 -0.30  0.27  0.78  0.48 -0.64 -0.52  0.51  1.00  1.00 -0.48  0.62
36 GAT  1.00 -0.30  0.33  0.85  0.68 -0.28  0.16  0.94  0.42  1.00 -0.38  0.65
37 GCA  1.00 -0.56 -0.41  0.47  0.76 -0.47 -0.36  0.69  0.41  0.55 -0.62 -0.06
38 GCC  1.00  0.50  0.60  0.70  1.00  1.00  0.41  0.76  1.00  1.00  0.92  0.45
39 GCG  1.00  0.07 -0.01  0.17  0.72 -0.52 -0.04  0.63  0.65  0.73 -0.27  1.00
40 GCT  1.00 -0.78 -0.07  0.49  0.37 -0.76 -0.70  0.18  0.22  1.00 -0.70  1.00
41 GGA  1.00  1.00  0.88  1.00  1.00  1.00  1.00  1.00  1.00  1.00  0.23  1.00
42 GGC  1.00  0.25  0.07  0.58  0.92 -0.39  0.26  0.87  0.43  0.71 -0.13  0.77
43 GGG  1.00  0.71  1.00  0.84  1.00  0.07  0.88  1.00  0.49  1.00  0.27  1.00
44 GGT  1.00 -0.75 -0.45 -0.04  0.80 -0.67 -0.71 -0.01  0.32  1.00 -0.64  0.46
45 GTA  1.00 -0.52 -0.56  0.51  0.31 -0.68 -0.65  0.33 -0.47  1.00 -0.63 -0.55
46 GTC  1.00  0.44  0.77  0.83  1.00  1.00  0.49  0.76  0.21  1.00  0.53  0.37
47 GTG  1.00 -0.10  0.06  0.76  0.71  0.82  0.93  1.00  0.54  1.00  0.56  0.63
48 GTT  1.00 -0.78  0.03  0.61  0.39 -0.80 -0.65  0.20  0.33  1.00 -0.55 -0.53
49 TAA  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
50 TAC  1.00 -0.74 -0.55  0.33  1.00 -0.66 -0.55 -0.02  0.23  1.00 -0.48  1.00
51 TAG  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
52 TAT  1.00  0.57  0.91  0.78  1.00  0.18  0.01  1.00  0.71  1.00 -0.26  0.30
53 TCA  1.00  1.00  0.72  1.00  1.00  1.00  0.69  1.00  1.00  1.00  0.68  1.00
54 TCC  1.00 -0.81 -0.61  0.04  1.00 -0.69 -0.50  0.60  1.00  1.00 -0.62  1.00
55 TCG  1.00  0.84  0.11  0.74  1.00  1.00  1.00  1.00  0.29  1.00  0.75  1.00
56 TCT  1.00 -0.78 -0.47  0.72  1.00 -0.86 -0.82  0.58  1.00  1.00 -0.83  0.09
57 TGA  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
58 TGC  1.00 -0.24 -0.07  1.00  1.00 -0.58 -0.45  0.55  0.05  1.00 -0.58  1.00
59 TGG  1.00 -0.28  0.00  0.18  1.00 -0.57 -0.34  1.00  1.00  1.00 -0.47  1.00
60 TGT  1.00 -0.73  0.10  1.00 -0.54  0.25  1.00  0.44 -0.16  1.00  0.57  1.00
61 TTA  1.00  1.00  1.00  0.85  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
62 TTC  1.00 -0.74 -0.30  0.73  1.00 -0.79 -0.71  0.41  0.23  1.00 -0.73 -0.28
63 TTG  1.00  0.82  0.23  1.00  1.00  1.00  0.84  0.67  0.58  1.00  0.70  1.00
64 TTT  1.00  0.19  0.32  0.86  1.00  1.00  0.90  1.00  1.00  1.00  1.00  1.00
         ATA   ATC   ATG   ATT   CAA   CAC   CAG   CAT   CCA   CCC   CCG   CCT
          13    14    15    16    17    18    19    20    21    22    23    24

Table C.6 (continued)
          25    26    27    28    29    30    31    32    33    34    35    36
         CGA   CGC   CGG   CGT   CTA   CTC   CTG   CTT   GAA   GAC   GAG   GAT
 1 AAA  1.00  0.43  1.00 -0.36  1.00  0.76 -0.46  0.84 -0.17 -0.53 -0.32  0.26
 2 AAC  1.00 -0.26  1.00 -0.70  0.42 -0.38 -0.66  0.80 -0.47 -0.62 -0.15  0.01
 3 AAG  0.33 -0.32  1.00 -0.61  1.00  0.19 -0.41  0.73  0.95  0.82  1.00  0.84
 4 AAT  1.00  1.00  1.00  1.00  1.00  1.00  0.74  1.00  0.82  0.59  0.79  0.85
 5 ACA  1.00  1.00  1.00  0.34  1.00  1.00  0.86  1.00  0.41  0.41  0.74  1.00
 6 ACC  1.00  0.33  1.00 -0.56  1.00  0.62 -0.45  0.61 -0.36 -0.40  0.22  0.31
 7 ACG  1.00  0.90  1.00  0.79  1.00  1.00 -0.31  1.00  0.89  1.00  1.00  1.00
 8 ACT  1.00 -0.59  1.00 -0.77 -0.32  1.00 -0.75  1.00 -0.61 -0.83  0.18 -0.27
 9 AGA  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
10 AGC  1.00  0.07  1.00 -0.69  1.00  1.00 -0.43  0.45 -0.20 -0.19  0.75  0.33
11 AGG  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
12 AGT  1.00  1.00  1.00  1.00  1.00  1.00  0.90  1.00  0.79  0.79  1.00  0.88
13 ATA  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
14 ATC  1.00 -0.34  1.00 -0.63  1.00 -0.67 -0.59  0.13 -0.57 -0.72 -0.08 -0.07
15 ATG  1.00  0.35  1.00 -0.49  1.00  0.27 -0.41  0.85 -0.10 -0.45  0.24  0.49
16 ATT  1.00  0.55  1.00 -0.12  1.00  0.64 -0.13  1.00  0.53  0.57  0.86  0.80

[Table: see original document, pages 132 and 133.]

56 TCT -0.78 -0.19 -0.62 -0.83  1.00 -0.52  1.00 -0.84 -0.87 -0.14  0.32 -0.74
57 TGA  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
58 TGC -0.67  0.47  0.40  0.55  1.00  0.41  1.00 -0.60 -0.44  0.36  0.25 -0.07
59 TGG -0.46  0.80  0.07 -0.28  1.00 -0.26  1.00 -0.26 -0.28  0.67  0.42 -0.48
60 TGT  0.57  0.67 -0.21 -0.41  0.30  0.07  1.00 -0.25 -0.70  1.00  0.06  0.33
61 TTA  0.71  0.77  1.00  0.42  1.00  0.91  0.76  0.68  0.41  1.00  1.00  0.82
62 TTC -0.37  0.11  0.19 -0.63  0.18 -0.19  0.71 -0.72 -0.70  0.72  0.43 -0.71
63 TTG  1.00  0.88  0.91  0.80  1.00  1.00  1.00  0.89  0.70  1.00  1.00  0.82
64 TTT  0.57  0.54  0.44 -0.55  1.00  0.80  0.89  0.23  0.26  0.79  0.69  0.12
         GCA   GCC   GCG   GCT   GGA   GGC   GGG   GGT   GTA   GTC   GTG   GTT
          37    38    39    40    41    42    43    44    45    46    47    48

Table C.6 (continued)
          49    50    51    52    53    54    55    56    57    58    59    60
         TAA   TAC   TAG   TAT   TCA   TCC   TCG   TCT   TGA   TGC   TGG   TGT
 1 AAA  0.00 -0.31  0.00  0.21  1.00 -0.35  0.66 -0.65  0.00 -0.06  0.02
0.202AAC0.00-0.710.000.401.00-0.740.81-0.710.00-0.62-0.450.183AAG0.000.140.000.351.000.210.24-0.810.00-0.29-0.070.114AAT0.001.000.001.001.000.511.001.000.001.001.001.00ACA0.001.000.001.001.001.001.000.310.001.000.401.006ACC0.00-0.600.000.381.00-0.150.81-0.690.00-0.27-0.53-0.617ACG0.001.000.000.851.001.001.000.670.001.001.001.008ACT0.00-0.770.000.530.36-0.880.00-0.880.00-0.451.001.009AGA0.001.000.001.001.001.001.001.000.001.001.001.0010AGC0.00-0.400.000.491.00-0.541.00-0.470.00-0.280.700.4211AGG0.001.000.001.001.001.001.001.000.001.001.001.0012AGT0.000.690.001.001.000.321.000.300.001.001.001.0013ATA0.001.000.001.001.001.001.001.000.001.001.001.0014ATC0.00-0.690.000.281.00-0.840.84-0.640.00-0.41-0.58-0.5315ATG0.00-0.430.000.561.00-0.600.80-0.420.000.400.00-0.3316ATT0.000.300.000.701.000.030.87-0.530.000.531.000.7017CAA0.000.400.000.891.000.541.001.000.001.000.641.0018CAC0.00-0.770.000.61-0.27-0.401.00-0.830.00-0.60-0.57-0.0519CAG0.00-0.530.000.231.00-0.310.77-0.800.00-0.51-0.250.1420CAT0.001.000.000.711.001.001.001.000.000.371.001.0021CCA0.000.340.000.751.00-0.221.000.340.00-0.740.471.0022CCC0.001.000.001.001.001.001.001.000.001.001.001.0023CCG0.00-0.440.000.370.44-0.521.00-0.780.000.32-0.41-0.4124CCT0.00-0.770.000.700.05-0.681.00-0.580.001.001.001.0025CGA0.001.000.001.001.001.001.001.000.001.001.001.0026CGC0.000.010.000.501.00-o.u1.00-0.140.000.36-0.521.0027CGG0.001.000.001.001.001.001.001.000.001.001.001.0028CGT0.00-0.640.00-0.160.13-0.750.77-0.870.00-0.700.560.1529CTA0.001.000.001.001.001.001.001.000.001.001.001.0030CTC0.00-0.310.001.001.00-0.501.00-0.360.001.001.001.0031CTG0.00-0.670.000.240.74-0.600.70-0.800.00-0.48-0.44-0.4232CTT0.000.020.001.000.37-0.351.00-0.520.00-0.251.001.0033GAA0.00-0.540.000.641.00-0.560.77-0.570.000.400.28-0.3434GAC0.00-0.660.000.370.71-0.800.54-0.490.00-0.62-0.610.4035GAG0.00-0.070.000.651.00-0.531.00-0.600.00-0.25-0.380.4436GAT0.00-0.180.000.830.83-0.411.00-0.150.000.430.940.4637GCA0.00-0.420.000.680.69-0.430.76-0.700.000.50-0.62-0.473
8GCC0.000.440.000.081.000.421.00-0.380.000.22-0.020.0239GCG0.000.410.001.001.000.110.72-0.610.000.410.69-O.IO13440GCT0.00-0.830.000.001.00-0.810.37-0.890.00-0.620.651.0041GGA0.000.730.001.000.461.001.001.000.001.001.001.0042GGC0.00-0.360.000.571.00-0.331.000.400.000.32-0.50-0.0743GGG0.001.000.001.001.001.001.000.360.001.001.001.0044GGT0.00-0.680.000.160.15-0.790.87-0.870.00-0.620.44-0.0345GTA0.00-0.400.000.641.00-0.821.00-0.840.00-0.46-0.71-0.6846GTC0.00-0.020.000.871.000.721.000.710.001.000.651.0047GTG0.000.110.000.771.000.35LOO0.330.00-0.020.171.0048GTT0.00-0.710.00-0.161.00-0.811.00-0.850.00-0.610.700.5449TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.00-0.780.000.611.00-0.760.67-0.770.00-0.71-0.571.0051TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.000.810.001.001.000.221.001.000.001.001.000.3453TCA0.000.240.001.001.001.001.000.140.001.000.321.0054TCC0.00-0.680.000.531.00-0.630.35-0.520.00-0.71-0.80-0.5455TCG0.001.000.001.001.001.001.00-0.260.001.001.001.0056TCT0.00-0.820.000.030.14-0.84-0.63-0.880.00-0.440.43-0.5557TCA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.00-0.760.001.001.00-0.580.23-0.760.00-0.54-0.44-0.2659TGG0.00-0.410.000.521.00-0.501.00-0.710.001.000.00-0.5660TGT0.001.000.001.001.00-0.011.00-0.040.001.001.001.0061TTA0.000.610.001.001.001.001.001.000.001.001.001.0062TTC0.00-0.740.000.111.00-0.790.72-0.860.00-0.69-0.57-0.1563TTG0.001.000.001.001.001.001.000.150.001.000.171.0064TTT0.000.820.000.871.000.561.001.000.001.001.001.00TAATACTAGTATTCATCCTCGTCTTGATGCTGGTGT495051525354555657585960表C.6繼續(xù)61626364TTATTCTTGTTT1AAA0.87-0.430.870.692AAC1.00-0.720.830.573AAG0.78-0.530.780.424AAT0.801.001.000.935ACA1.001.001.001.006ACC0.84-0.591.000.287ACG0.750.430.740.798ACT1.00-0.640.580.319AGA1.001.001.001.0010AGC1.00-0.480.560.8811AGG1.001.001.001.0012AGT1.000.400.591.0013ATA1.001.001.001.0014ATC1.00-0.691.000.2915ATG0.76-0.520.760.8116ATT1.000.050.850.8817CAA1.001.001.000.8818CAC1.00-0.770.560.4919CAG1.00-0.660.750.5820CAT1.000.7
40.671.0021CCA1.001.001.001.0022CCC1.001.001.001.0023CCG0.80-0.700.410.4024CCT1.000.661.000.5025CGA1.001.001.001.0026CGC1.00-0.481.000.3627CGG1.001.001.001.0028CGT1.00-0.611.000.5129CTA1.001.001.001.0030CTC1.001.001.000.7331CTG0.57-0.720.460.4432CTT1.000.271.001.0033GAA0.95-0.350.950.7634GAC0.86-0.740.580.4035GAG0.88-0.631.000.4636GAT0.840.220.840.8737GCA1.00-0.700.830.2938GCC1.00-0.061.000.8439GCG1.00-0.441.000.7640GCT0.78-0.441.000.7341GGA1.001.001.001.0042GGC1.00-0.600.900.3343GGG1.000.841.001.0044GGT0.76-0.480.880.3645GTA1.00-0.630.450.0046GTC1.00-0.431.000.7647GTG1.00-0.550.770.7248GTT1.00-0.060.840.9049TAA0.000.000.000.0050TAC1.00-0.751.000.4251TAG0.000.000.000.0052TAT1.000.651.000.9153TCA1.001.001.001.0054TCC0.59-0.790.590.3355TCG1.000.421.000.7856TCT0.58-0.730.570.3057TGA0.000.000.000.0058TGC1.000.161.00-0.4759TGG0.51-0.451.000.6160TGT1.00-0.051.001.0061TTA1.001.001.001.0062TTC1.00-0.761.000.1663TTG1.000.700.581.0064TTT1.001.001.000.92TTATTCTTGTTT61626364<table>tableseeoriginaldocumentpage137</column></row><table>49TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.12-0.16-0.06-0.170.130.05-0.140.14-0.59-0.55-0.61-0.5751TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT-0.140.140.230.14-0.150.28-0.030.080.500.820.590.7053TCA-0.38-0.39-0.19-0.39-0.28-0.360.06-0.58-0.50-0.30-0.42-0.2654TCC0.290.350.46-0.180.250.110.54-0.33-0.440.350.37-0.0855TCG0.35-0.06-0.140.220.18-0.26-0.31-0.240.090.000.160.4156TCT-0.040.400.330.240.200.490.63-0.010.760.910.750.7457TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.21-0.16-0.170.090.27-0.15-0.100.380.10-0.27-0.39-0.2259TGG0.050.18-0.10-0.150.160.21-0.210.15-0.32-0.090.01-0.2760TGT-0.180.460.00-0.22-0.110.46-0.230.000.770.910.740.5961TTA-0.47-0.45-0.45-0.43-0.25-0.32-0.19-0.34-0.52-0.18-0.30-0.3362TTC0.19-0.010.060.040.25-0.200.20-0.25-0.62-0.45-0.63-0.4163TTG-0.25-0.09-0.16-0.28-0.22-0.19-0.13-0.35-0.270.35-0.200.1364TTT-0.180.060.13-0.08-0.15-0.060.06-0.040.580.930.670.66AAAAACAAGAATACAACCACGACTAGAAGCA
GGAGT123456789101112表C.7繼續(xù)131415161718192021222324ATAATCATGATTCAACACCAGCATCCACCCCCGCCTIAAA-0.41-0.20-0.080.26-0.090.320.260.200.180.450.260.342AAC0.240.090.090.150.170.250.320.100.190.400.240.133AAG0.140.180.200.200.00-0.28-0.30-0.40-0.30-0.28-0.43-0.374AAT-0.31-0.08-0.08-0.08-0.34-0.06-0.07-0.18-0.11-0.12-0.22-0.185ACA-0.120.340.060.35-0.150.140.050.170.230.110.080.076ACC-0.09-0.040.43-0.390.150.240.30-0.100.220.380.580.297ACG-0.01-0.13-0.30-0.210.25-0.13-0.120.170.03-0.23-0.27-0.168ACT0.030.740.570.46-0.53-0.290.02-0.41-0.030.050.14-0.429AGA-0.40-0.21-0.37-0.15-0.100.47-0.270.350.380.490.400.2910AGC0.12-0.01-0.12-0.11-0.030.10-0.12-0.160.500.160.110.4411AGG-0.350.370.01-0.110.00-0.24-0.37-0.25-0.13-0.31-0.240.2112AGT-0.210.23-0.14-0.06-0.250.38-0.06-0.26-0.29-0.01-0.03-0.0713ATA-0.240.590.420.61-0.37-0.04-0.13-0.12-0.200.120.03-0.2014ATC0.15-0.11-0.03-0.160.490.490.500.360.380.410.390.3915ATG0.130.070.00-0.100.43-0.15-0.220.140.490.13-0.09-0.0416ATT-0.090.02-0.08-0.01-0.36-0.23-0.25-0.33-0.19-0.35-0.30-0.2317CAA-0.31-0.10-0.210.000.550.690.700.720.530.600.820.5418CAC0.200.150.240.190.420.340.430.150.180.390.280.2419CAG0.220.000.170.100.07-0.37-0.42-0.27-0.18-0.30-0.32-0.3720CAT-0.17-0.05-0.15-0.16-0.20-0.19-0.27-0.11-0.24-0.17-0.17-0.1621CCA0.000.270.040.390.190.040.360.38-0.390.200.54-0.0422CCC0.300.500.530.300.470.400.620.23-0.280.530.690.2723CCG0.17-0.25-0.26-0.32-0.18-0.23-0.120.07-0.25-0.39-0.13-0.0724CCT0.240.650.600.43-0.22-0.010.02-0.120.21-0.060.33-0.1525CGA-0.50-0.20-0.040.310.12-0.380.090.01-0.450.310.410.0226CGC0.310.010.300.470.560.330.390.400.070.220.330.5527CGG-0.33-0.31-0.30-0.27-0.23-0.43-0.39-0.27-0.48-0.48-0.37-0.4028CGT0.120.310.420.38-0.02-0.23-0.090.05-0.40-0.24-0.07-0.1529CTA-0.380.320.380.47-0.29-0.010.66-0.21-0.57-0.090.37-0.4430CTC0.280.200.390.110.600.670.680.380.490.530.560.3131CTG0.27-0.29-0.30-0.17-0.25-0.27-0.14-0.010.28-0.34-0.38-0.1232CTT0.450.550.510.42-0.27-0.36-0.24-0.34-0.03-0.23-0.15-0.40138<table>tableseeoriginaldocumentp
age139</column></row><table>17CAA0.240.570.560.56-0.180.750.640.760.160.090.100.1818CAC-0.20-0.32-0.14-0.190.390.200.230.180.24-0.11-0.230.0019CAG0.01-0.34-0.31-0.350.02-0.38-0.34-0.46-0.140.050.05-0.1720CAT-0.140.220.11-0.07-0.190.01-0.080.14-0.070.110.03-0.0121CCA-0.14-0.12-0.07-0.21-0.490.440.690.540.300.260.230.3522CCC-0.260.040.08-0.160.270.480.660.380.05-0.36-0.36-0.1723CCG-0.05-0.220.140.01-0.20-0.43-0.18-0.380.200.270.320.2724CCT-0.270.200.34-0.05-0.340.380.510.10-0.44-0.31-0.38-0.4525CGA-0.61-0.33-0.22-0.24-0.530.670.590.550.510.380.550.2526CGC-0.20-0.110.250.240.410.090.160.420.31-0.35-0.46-0.0327CGG-0.36-0.34-0.22-0.28-0.62-0.54-0.45-0.390.230.460.280.2328CGT-0.55-0.140.08-0.28-0.42-0.29-0.19-0.19-0.12-0.27-0.25-0.1429CTA-0.64-0.16-0.16-0.37-0.700.740.680.320.370.590.490.2630CTC0.500.350.26-0.100.320.560.710.490.28-0.46-0.30-0.1631CTG0.45-0.39-0.10-0.040.18-0.40-0.36-0.280.310.490.350.4232CTT-0.11-0.030.16-0.25-0.08-0.080.20-0.02-0.42-0.17-0.36-0.4533GAA0.290.480.020.370.330.600.270.51-0.13-0.06-0.06-0.0234GAC-0.01-0.07-0.20-0.270.360.380.320.230.30-0.07-0.130.0335GAG-0.11-0.43-0.52-0.36-0.23-0.32-0.54-0.580.220.200.27-0.0336GAT-0.260.150.280.09-0.25-0.07-0.120.04-0.190.040.08-0.0137GCA0.320.470.450.52-0.060.410.460.470.110.090.150.1738GCC0.070.070.260.010.440.560.610.440.11-0.26-0.41-0.2039GCG-0.24-0.34-0.050.04-0.21-0.57-0.37-0.360.220.370.350.3240GCT-0.56-0.14-0.05-0.28-0.320.220.45-0.18-0.39-0.19-0.11-0.3941GGA0.190.440.080.270.090.410.120.04-0.24-0.040.020.0742GGC0.06-0.10-0.12-0.050.620.310.240.340.32-0.15-0.230.0343GGG-0.28-0.20-0.32-0.23-0.04-0.39-0.45-0.530.270.520.420.4144GGT-0.120.390.310.150.020.04-0.07-0.39-0.32-0.29-0.21-0.3445GTA-0.060.350.150.170.170.460.380.550.130.150.130.3246GTC0.450.190.100.160.620.660.580.570.29-0.30-0.35-0.1147GTG0.44-0.36-0.150.130.39-0.52-0.57-0.430.180.520.330.3548GTT-0.370.050.07-0.20-0.13-0.110.18-0.14-0.35-0.17-0.12-0.2749TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.20-0.20-0.43-0.440.320.460.240.4
70.320.050.140.1651TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.390.540.240.41-0.230.050.070.29-0.24-0.060.06-0.0953TCA0.450.430.430.42-0.090.330.370.20-0.10-0.010.000.1554TCC0.170.020.07-0.190.500.510.580.320.340.060.140.0455TCG-0.24-0.45-0.130.11-0.28-0.50-0.29-0.260.400.360.360.4156TCT0.120.500.470.22-0.17-0.040.38-0.23-0.38-0.32-0.15-0.3457TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC-0.24-0.27-0.19-0.020.670.090.100.210.47-0.07-0.040.2759TGG0.410.06-0.04-0.020.01-0.23-0.280.020.060.04-0.11-0.0360TGT-0.100.490.530.34-0.04-0.21-0.33-0.14-0.42-0.24-0.15-0.1261TTA0.630.770.440.600.290.660.630.61-0.020.310.170.1262TTC0.09-0.30-0.24-0.440.370.370.360.440.460.11-0.050.3763TTG0.26-0.14-0.13-0.15-0.19-0.05-0.18-0.260.100.430.250.0964TTT0.380.470.420.34-0.27-0.040.240.16-0.28-0.120.14-0.18CGACGCCGGCGTCTACTCCTGCTTGAAGACGAGGAT252627282930313233343536表C.7繼續(xù)373839404142434445464748GCAGCCGCGGCTGGAGGCGGGGGTGTAGTCGTGGTT140<table>tableseeoriginaldocumentpage141</column></row><table>56TCT-0.140.050.50-0.250.660.830.750.53-0.37-0.230.33-0.4257TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.21-0.20-0.270.03-0.28-0.34-0.22-0.330.23-0.07-0.010.3759TGG0.050.29-0.200.07-0.180.04-0.080.360.270.26-0.370.1660TGT0.120.420.310.400.780.870.840.67-0.13-0.10-0.17-0.1761TTA0.370.410.560.29-0.070.260.050.140.280.490.550.4162TTC0.17-0.26-0.14-0.05-0.45-0.55-0.32-0.540.31-0.230.200.1063TTG-0.150.220.09-0.050.160.490.270.270.280.430.250.1364TTT-0.04-0.010.230.140.670.770.770.57-0.04-0.160.160.05GCAGCCGCGGCTGGAGGCGGGGGTGTAGTCGTGGTT373839404142434445464748表C.7繼續(xù)495051525354555657585960TAATACTAGTATTCATCCTCGTCTTGATGCTGGTGT1AAA0.00-0.100.00-0.03-0.260.150.230.380.000.00-0.07-0.132AAC0.000.100.000.030.380.410.390.270.00-0.21-0.17-0.053AAG0.000.120.000.170.260.440.170.320.000.160.160.024AAT0.000.030.00-0.11-0.11-0.050.210.050.000.330.19-0.065ACA0.00-0.200.00-0.13-0.17-0.18-0.06-0.190.00-0.15-0.38-0.126ACC0.000.250.000.050.370.260.190.080.00-0.150.36-0.297ACG0.000.
220.000.130.260.300.020.320.000.360.330.488ACT0.00-0.280.00-0.26-0.50-0.40-0.26-0.580.00-0.13-0.27-0.589AGA0.00-0.130.00-0.19-0.030.11-0.060.490.000.34-0.400.0810AGC0.000.020.00-0.100.350.350.230.400.00-0.03-0.36-0.0911AGG0.000.490.000.390.010.640.100.190.000.48-0.300.1512AGT0.000.250.00-0.12-0.350.10-0.10-0.180.000.220.710.3813ATA0.00-0.290.00-0.06-0.21-0.34-0.17-0.130.00-0.16-0.47-0.4714ATC0.000.390.000.360.460.340.360.560.00-0.070.080.1515ATG0.00-0.030.000.030.160.09-0.010.180.000.000.000.0016ATT0.00-0.180.00-0.30-0.29-0.350.14-0.440.000.350.17-0.2417CAA0.00-0.370.00-0.35-0.39-0.41-0.35-0.280.00-0.31-0.42-0.2618CAC0.000.270.000.330.410.210.280.350.000.060.080.0619CAG0.000.250.000.450.310.400.190.370.000.200.480.4120CAT0.00-0.210.00-0.17-0.30-0.32-0.23-0.070.000.05-0.06-0.1821CCA0.00-0.320.00-0.28-0.27-0.540.01-0.540.00-0.57-0.41-0.4522CCC0.000.560.000.410.490.420.490.460.000.180.490.1923CCG0.000.170.000.150.080.06-0.110.050.000.190.060.4924CCT0.00-0.320.00-0.44-0.41-0.400.02-0.540.00-0.22-0.18-0.4125CGA0.00-0.160.000.22-0.18-0.58-0.49-0.500.00-0.680.38-0.3726CGC0.00-0.170.000.110.470.33-0.190.520.00-0.08-0.310.2627CGG0.000.350.000.110.400.620.220.440.000.360.640.5128CGT0.00-0.150.00-0.26-0.40-0.30-0.46-0.440.000.080.540.5329CTA0.00-0.410.00-0.44-0.41-0.54-0.48-0.620.00-0.72-0.56-0.6830CTC0.000.490.000.450.580.480.550.600.000.240.400.3531CTG0.00-0.100.000.14-0.07-0.01-0.290.000.000.040.070.5232CTT0.00-0.090.00-0.21-0.16-0.350.07-0.530.000.020.22-0.3733GAA0.00-0.200.000.00-0.09-0.09-0.170.270.000.01-0.11-0.2334GAC0.000.350.000.290.520.530.540.450.00-0.10-0.130.0435GAG0.000.210.000.210.430.500.310.390.000.180.230.2136GAT0.00-0.210.00-0.20-0.22-0.28-0.22-0.150.000.060.130.0337GCA0.00-0.350.00-0.07-0.35-0.41-0.40-0.380.00-0.32-0.48-0.1438GCC0.000.380.000.150.430.360.330.460.00-0.240.28-0.1339GCG0.000.210.000.400.480.37-0.190.350.000.270.220.5514240GCT0.00-0.450.00-0.47-0.50-0.54-0.33-0.530.00-0.05-0.06-0.1141GGA0.000.040.00-0.09-0.260.790.040.060.00-0.060.160.1642GGC0
.00-0.110.000.000.240.35-0.100.360.00-0.28-0.35-0.3343GGG0.000.380.000.290.410.660.590.530.000.560.550.4244GGT0.000.010.00-0.26-0.60-0.28-0.43-0.500.000.490.550.6345GTA0.00-0.400.00-0.28-0.42-0.47-0.40-0.420.00-0.35-0.43-0.3546GTC0.000.440.000.230.550.450.490.480.00-0.080.170.0347GTG0.000.030.000.120.270.18-0.230.340.000.210.100.5148GTT0.00-0.140.00-0.24-0.24-0.47-0.02-0.470.000.260.12-0.2749TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.000.230.000.340.350.380.350.280.00-0.14-0.08-0.0651TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.00-0.190.00-0.17-0.25-0.24-0.01-0.080.000.150.06-0.0253TCA0.000.180.000.25-0.15-0.220.21-0.210.000.11-0.130.0954TCC0.000.360.00-0.100.31-0.100.45-0.010.00-0.290.25-0.2155TCG0.000.140.000.450.320.24-0.390.350.00-0.080.110.5856TCT0.00-0.280.00-0.47-0.41-0.50-0.02-0.550.000.140.27-0.1957TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.00-0.050.000.300.460.29-0.250.260.00-0.08-0.30-0.3159TGG0.000.010.00-0.010.160.33-0.210.100.00-0.030.000.0660TGT0.000.020.00-0.34-0.32-0.17-0.29-0.350.000.500.820.2661TTA0.00-0.260.00-0.27-0.38-0.230.00-0.210.00-0.04-0.44-0.3862TTC0.000.310.000.420.200.070.060.290.00-0.18-0.05-0.1363TTG0.000.260.000.060.10-0.140.110.020.000.170.080.4164TTT0.00-0.180.00-0.24-0.26-0.230.18-0.340.000.270.04-0.08TAATACTAGTATTCATCCTCGTCTTGATGCTGGTGT495051525354555657585960表C.7繼續(xù)61TTA62TTC63TTG64TTT1AAA0.10-0.150.170.242AAC0.19-0.080.380.113AAG0.63-0.110.69-0.144AAT-0.37-0.07-0.020.015ACA-0.23-0.16-0.52-0.126ACC0.460.170.160.077ACG0.440.030.660.268ACT-0.45-0.10-0.45-0.389AGA0.170.30-0.200.3410AGC0.360.150.250.1911AGG0.180.190.520.0112AGT-0.400.06-0.23-0.2813ATA-0.48-0.32-0.61-0.2214ATC0.370.300.550.4015ATG0.660.100.81-0.0716ATT-0.62-0.26-0.52-0.1817CAA-0.14-0.27-0.25-0.3218CAC0.360.120.200.4019CAG0.760.190.490.3420CAT-0.36-0.14-0.43-0.1921CCA-0.29-0.19-0.42-0.1922CCC0.650.560.430.4823CCG0.430.070.62-0.13<table>tableseeoriginaldocumentpage144</column></row><table><table>tableseeoriginaldocumentpage
0</column></row><table>48GTT0.490.600.750.280.420.360.650.170.771.000.600.7849TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.060.510.23-0.25-0.10-0.12-0.15-0.01-0.19-0.34-0.17-0.6751TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT-0.220.070.42-0.27-0.170.000.240.550.840.901.000.7053TCA-0.10-0.49-0.25-0.050.150.21-0.33-0.31-0.43-0.45-0.08-0.7454TCC0.330.650.25-0.550.760.12-0.040.02-0.47-0.091.00-0.2255TCG0.100.410.580.000.190.50-0.15-0.390.090.75-0.511.0056TCT-0.020.110.180.35-0.370.030.70-0.071.001.001.001.0057TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.75-0.400.28-0.221.00-0.56-0.441.000.22-0.41-0.63-0.1959TGGo.n-0.22-0.240.35-0.010.73-0.12-0.32-0.350.11-0.05-0.6160TGT-0.551.00-0.040.380.401.00-0.36-0.181.001.001.001.0061TTA-0.38-0.24-0.46-0.470.15-0.12-0.35-0.650.08-0.250.37-0.1662TTC-0.06-0.080.130.420.22-0.210.55-0.53-0.58-0.44-0.59-0.4263TTG-0.27-0.35-0.47-0.360.59-0.330.30-0.39-0.170.461.000.6164TTT-0.01-0.020.05-0.18-0.230.040.100.070.640.771.001.00AAAAACAAGAATACAACCACGACTAGAAGCAGGAGT123456789101112表C.8繼續(xù)131415161718192021222324ATAATCATGATTCAACACCAGCATCCACCCCCGCCTlAAA-0.14-0.34-0.020.51-0.070.150.450.25-0.280.090.430.312AAC0.430.18-0.04-0.03-0.10-0.050.390.050.020.190.350.503AAG0.110.060.060.110.20-0.52-0.56-0.150.22-0.38-0.46-0.564AAT-0.590.050.05-0.07-0.200.43-0.18-0.26-0.40-0.42-0.420.525ACA0.510.27-0.220.61-0.420.06-0.110.500.120.28-0.170.506ACC-0.45-0.090.56-0.450.680.420.080.320.201.000.750.387ACG-0.33-0.24-0.28-0.350.40-0.080.24-0.22-0.12-0.20-0.400.278ACT0.500.810.840.500.30-0.22-0.44-0.430.110.270.32-0.419AGA-0.19-0.130.01-0.130.460.73-0.07-0.221.001.00-0.240.6310AGC-0.010.000.10-0.130.230.21-0.28-0.490.46-0.440.421.0011AGG-0.640.480.330.160.08-0.640.670.281.00-0.63-0.331.0012AGT0.02-0.090.760.020.630.62-0.070.70-0.84-0.23-0.31-0.1713ATA0.320.501.000.86-0.670.10-0.17-0.070.05-0.57-0.470.6414ATC0.24-0.28-0.03-0.030.640.240.330.470.300.570.570.1915ATG0.14-0.060.000.030.47-0.23-0.250.230.140.290.05-0.2416ATT-0.18-0.0
6-0.140.24-0.43-0.310.04-0.26-0.59-0.41-0.21-0.1417CAA0.15-0.10-0.25-0.030.671.000.600.730.371.001.000.7618CAC-0.420.28-0.07-0.110.050.290.54-0.110.551.000.10-0.1819CAG0.590.120.25-0.140.11-0.52-0.46-0.200.091.00-0.45-0.5620CAT0.32-0.070.06-0.02-0.110.21-0.27-0.20-0.30-0.15-0.170.1821CCA1.000.36-0.21-0.520.460.021.001.001.001.001.001.0022CCC1.001.001.000.720.110.200.680.051.001.001.00-0.1223CCG0.47-0.32-0.33-0.24-0.19-0.360.160.06-0.09-0.45-0.20-0.2124CCT1.000.380.880.47-0.520.44-0.050.121.00-0.710.37-0.0625CGA1.00-0.43-0.63-0.560.381.000.550.021.001.001.001.0026CGC0.710.030.030.010.150.05-0.03-0.171.001.000.180.4827CGG-0.56-0.14-0.18-0.200.26-0.10-0.320.12-0.24-0.770.20-0.3428CGT0.580.610.390.16-0.390.17-0.09-0.081.000.19-0.49-0.4729CTA1.00-0.62-0.271.001.001.001.001.001.001.000.411.0030CTC-0.610.100.02-0.040.740.430.540.401.001.000.460.4231CTG-0.48-0.25-0.24-0.19-0.02-0.170.100.22-0.42-0.29-0.31-0.0932CTT0.540.480.280.77-0.30-0.30-0.33-0.38-0.21-0.61-0.23-0.4133GAA-0.01-0.12-0.130.040.110.350.190.270.210.220.230.6434GAC0.380.120.210.130.350.660.290.120.110.510.140.4335GAG-0.200.300.380.00-0.25-0.38-0.30-0.47-0.58-0.48-0.44-0.3436GAT-0.320.12-0.15-0.24-0.39-0.180.01-0.260.000.39-0.310.0437GCA0.20-0.08-0.030.360.080.820.040.58-0.500.170.540.4238GCC-0.45-0.01-0.27-0.35-0.280.400.250.061.000.300.740.3539GCG0.53-0.16-0.17-0.200.22-0.12-0.230.02-0.150.03-0.15-0.4840GCT0.210.560.810.47-0.10-0.480.27-0.440.01-0.51-0.22-0.5241GGA0.150.08-0.12-0.06-0.44-0.210.02-0.091.000.590.02-0.4142GGC0.27-0.100.02-0.190.460.180.130.220.300.130.360.6043GGG-0.020.68-0.24-0.130.660.08-0.41-0.451.00-0.38-0.49-0.3444GGT0.160.270.45-0.15-0.490.050.700.250.190.33-0.19-0.5445GTA0.740.510.300.690.520.57-0.17-0.170.530.620.64-0.0746GTC0.21-0.38-0.32-0.170.350.600.740.380.720.530.700.7847GTG-0.090.010.01-0.090.080.69-0.43-0.170.01-0.47-0.53-0.4348GTT0.06-0.080.330.17-0.39-0.51-0.10-0.420.660.17-0.25-0.3549TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC-0.620.220.38-0.120.060.520.23-0.13-0
.400.660.230.3651TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT-0.400.19-0.250.13-0.130.15-0.12-0.261.000.39-0.410.2953TCA-0.270.49-0.130.32-0.30-0.06-0.030.790.451.000.41-0.0554TCC-0.13-0.33-0.10-0.540.780.320.681.000.070.240.290.6455TCG0.35-0.080.060.090.51-0.60-0.200.00-0.04-0.42-0.21-0.5856TCT0.160.14-0.180.33-0.37-0.45-0.040.09-0.700.44-0.12-0.5757TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC1.00-0.440.440.57-0.09-0.37-0.25-0.21-0.011.000.231.0059TGG0.480.330.00-0.310.63-0.23-0.310.231.00-0.710.130.4560TGT-0.350.14-0.390.07-0.050.540.620.64-0.66-0.580.55-0.7561TTA-0.43-0.10-0.16-0.060.510.640.030.430.040.610.560.6362TTC1.00-0.160.150.130.75-0.08-0.20-0.181.001.000.03-0.0963TTG0.530.050.44-0.28-0.22-0.17-0.34-0.341.00-0.360.610.0364TTT0.340.06-0.10-0.190.16-0.22-0.260.45-0.23-0.650.010.33ATAATCATGATTCAACACCAGCATCCACCCCCGCCT131415161718192021222324表C.8繼續(xù)252627282930313233343536CGACGCCGGCGTCTACTCCTGCTTGAAGACGAGGATiAAA-0.560.18-0.220.32-0.170.490.310.39-0.17-0.290.140.272AAC-0.40-0.26-0.39-0.091.000.00-0.170.130.200.50-0.070.333AAG1.00-0.13-0.030.360.15-0.52-0.63-0.640.240.010.480.174AAT-0.640.570.580.061.00-0.35-0.470.04-0.18-0.37-0.01-0.315ACA0.180.500.76-0.090.210.590.180.100.240.10-0.110.206ACC1.000.771.00-0.251.000.240.14-0.060.39-0.310.030.227ACG0.290.67-0.211.001.00-0.14-0.21-0.420.000.610.15-0.068ACT1.00-0.220.50-0.70-0.790.370.64-0.24-0.38-0.21-0.50-0.559AGA0.03-0.160.420.14-0.580.850.560.71-0.200.02-0.220.1810AGC-0.31-0.36-0.340.191.000.62-0.07-0.150.610.260.470.5711AGG1.00-0.01-0.751.001.00-0.36-0.82-0.56-0.430.63-0.540.3812AGT1.000.681.001.001.00-0.45-0.200.09-0.44-0.53-0.65-0.6013ATA-0.490.40-0.430.561.000.63-0.211.00-0.290.570.42-0.0814ATC-0.32-0.590.67-0.12-0.370.170.570.220.230.160.310.3715ATG0.160.360.130.160.46-0.04-0.37-0.230.000.05-0.01-0.0414716ATT-0.160.580.410.390.43-0.21-0.060.17-0.28-0.020.15-0.3817CAA0.370.610.430.150.131.000.870.860.21-0.19-0.050.1818CAC-0.72-0.270.47-0.50-0.400.780.010.460.110.450.190.3819CAG-0.4
6-0.45-0.39-0.250.37-0.36-0.54-0.47-0.100.04-0.06-0.0220CAT-0.29-0.230.580.69-0.240.140.310.25-0.13-0.37-0.01-0.0921CCA1.00-0.29-0.271.001.001.001.000.240.260.21-0.461.0022CCC1.00-0.65-0.701.001.000.000.430.370.39-0.42-0.340.1023CCG-0.64-0.22-0.040.380.28-0.460.100.190.130.360.360.1924CCT1.00-0.070.48-0.151.000.530.47-0.32-0.33-0.24-0.38-0.5425CGA1.000.32-0.251.001.000.290.190.560.80-0.341.000.1626CGC0.32-0.44-0.18-0.530.170.240.140.260.480.230.250.3527CGG1.000.59-0.371.001.00-0.66-0.31-0.160.170.40-0.150.2528CGT1.000.700.70-0.351.000.840.01-0.49-0.13-0.63-0.43-0.1829CTA1.001.001.001.001.001.001.000.351.001.00-0.21-0.2030CTC-0.440.050.73-0.281.000.320.690.410.41-0.190.27-0.0831CTG-0.01-0.41-0.270.100.40-0.52-0.46-0.290.270.410.110.4432CTT1.000.150.67-0.420.350.490.23-0.10-0.39-0.31-0.46-0.4633GAA0.290.45-0.220.410.570.500.380.56-0.060.02-0.160.0834GAC-0.63-0.34-0.59-0.47-0.020.140.360.010.250.140.030.2135GAG-0.25-0.59-0.46-0.40-0.38-0.42-0.55-0.640.250.130.24-0.2636GAT0.250.880.780.580.15-0.14-0.100.38-0.240.090.26-0.2637GCA1.000.470.79-0.070.240.120.610.570.280.010.06-0.0438GCC0.42-0.20-0.28-0.301.000.500.480.530.340.050.010.0639GCG0.19-0.23-0.360.460.56-0.58-0.29-0.40-0.070.420.120.2840GCT1.001.001.00-0.53-0.660.710.50-0.17-0.40-0.280.01-0.4541GGA0.490.220.24-0.201.000.420.200.28-0.25-0.04-0.190.2542GGC1.00-0.33-0.16-0.23-0.440.010.300.340.27-0.170.160.2643GGG-0.010.22-0.630.321.00-0.49-0.49-0.650.390.600.300.0144GGT0.160.490.500.07-0.520.190.30-0.41-0.22-0.51-0.36-0.0945GTA1.000.540.550.33-0.670.610.330.350.030.14-0.100.2846GTC0.54-0.040.18-0.24-0.170.760.510.710.480.05-0.19-0.1447GTG0.47-0.46-0.370.53-0.29-0.56-0.55-0.470.160.400.540.4148GTT-0.380.500.520.28-0.30-0.400.31-0.09-0.39-0.33-0.34-0.3349TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC-0.380.01-0.41-0.721.000.280.100.460.270.360.310.0251TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT-0.540.560.790.84-0.490.11-0.080.45-0.26-0.260.010.0053TCA1.00-0.311.000.510.020.490.420.370.06-0.04-0.2
6-0.0654TCC1.00-0.120.26-0.09-0.39-0.070.510.200.140.370.09-0.1455TCG1.00-0.06-0.60-0.19-0.73-0.58-0.27-0.170.380.57-0.160.1756TCT1.000.171.000.00-0.58-0.090.64-0.32-0.25-0.16-0.36-0.1457TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC-0.45-0.60-0.07-0.381.00-0.13-0.080.280.650.350.350.5959TGG1.000.56-0.21-0.21-0.490.24-0.510.05-0.010.100.03-0.0860TGT1.001.001.001.001.000.450.05-0.52-0.39-0.46-0.55-0.3661TTA0.150.480.240.811.000.610.710.760.030.380.300.1262TTC-0.70-0.260.34-0.591.000.320.540.430.480.220.040.5563TTG1.00-0.27-0.62-0.19-0.24-0.16-0.43-0.280.340.59-0.11-0.0264TTT1.000.750.500.45-0.50-0.39-0.070.57-0.30-0.350.15-0.09CGACGCCGGCGTCTACTCCTGCTTGAAGACGAGGAT252627282930313233343536表C.8繼續(xù)373839404142434445464748148<table>tableseeoriginaldocumentpage149</column></row><table>55TCG-0.40-0.43-0.09-0.030.37-0.071.000.580.78-0.250.69-0.3056TCT-0.260.310.520.070.670.650.840.33-0.220.220.50-0.4357TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC-0.500.160.030.02-0.46-0.540.47-0.080.200.270.721.0059TGG-0.480.100.250.37-0.190.26-0.510.66-0.190.25-0.180.1160TGT1.000.59-0.410.530.811.001.001.00-0.66-0.43-0.510.1761TTA0.160.600.560.36-0.070.02-0.06-0.21-0.360.310.50-0.1162TTC0.58-0.060.39-0.03-0.59-0.530.06-0.50-0.180.200.220.3763TTG-0.190.170.06-0.250.800.710.000.500.810.210.21-0.1764TTT0.21-0.32-0.20-0.080.850.711.000.580.00-0.450.180.28GCAGCCGCGGCTGGAGGCGGGGGTGTAGTCGTGGTT373839404142434445464748表C.8繼續(xù)495051525354555657585960TAATACTAGTATTCATCCTCGTCTTGATGCTGGTGT1AAA0.00-0.260.000.06-0.31-0.020.080.350.000.020.170.052AAC0.00-0.140.000.180.720.76-0.260.300.000.07-0.13-0.403AAG0.000.400.000.360.090.62-0.340.160.000.08-0.32-0.254AAT0.00-0.080.000.040.10-0.590.310.780.000.410.200.145ACA0.00-0.160.00-0.050.02-0.10-0.310.060.00-0.260.150.036ACC0.00-0.090.000.170.601.000.240.260.00-0.190.611.007ACG0.000.260.000.410.340.360.28-0.140.00-0.14-0.220.168ACT0.00-0.420.00-0.35-0.100.25-0.21-0.670.00-0.27-0.231.009AGA0.001.000.000.22-0.19-0.23-0.15-0.120.000.220.01-
0.4110AGC0.00-0.150.000.070.350.78-0.190.200.000.430.020.5911AGG0.001.000.000.340.481.001.000.350.001.00-0.561.0012AGT0.000.320.000.09-0.34-0.22-0.65-0.280.001.000.051.0013ATA0.00-0.600.00-0.37-0.11-0.301.00-0.040.00-0.38-0.521.0014ATC0.000.460.000.400.370.110.500.550.00-0.04-0.130.3515ATG0.000.180.00-0.14-0.430.720.52-0.040.000.580.00-0.4516ATT0.00-0.190.00-0.22-0.23-0.540.04-0.420.000.200.38-0.3817CAA0.00-0.320.00-0.47-0.21-0.47-0.580.140.00-0.17-0.42-0.4318CAC0.000.460.00-0.300.460.70-0.020.110.00-0.110.18-0.2319CAG0.000.590.000.42-0.180.320.420.000.00-0.140.531.0020CAT0.000.470.00-0.29-0.50-0.310.460.470.000.11-0.120.2321CCA0.00-0.330.00-0.430.031.00-0.46-0.580.00-0.170.05-0.4322CCC0.000.190.000.640.210.331.000.510.000.011.001.0023CCG0.000.090.000.07-0.030.500.440.360.000.45-0.31-0.2524CCT0.00-0.250.00-0.01-0.46-0.36-0.05-0.370.001.000.64-0.7025CGA0.00-0.510.00-0.440.29-0.171.000.130.00-0.451.001.0026CGC0.00-0.350.000.040.780.820.18-0.320.00-0.40-0.46-0.6427CGG0.000.400.000.200.581.000.600.480.000.460.551.0028CGT0.00-0.440.00-0.16-0.420.20-0.58-0.680.001.001.001.0029CTA0.001.000.00-0.78-0.60-0.29-0.69-0.030.001.00-0.79-0.8230CTC0.000.170.000.560.240.63-0.190.730.000.490.37-0.3231CTG0.00-0.120.00-0.21-0.050.16-0.350.010.000.42-0.080.5832CTT0.00-0.130.000.45-0.260.31-0.03-0.500.000.040.60-0.5733GAA0.00-0.080.00-0.06-0.01-0.180.090.290.00-0.110.080.1034GAC0.000.320.000.510.760.600.400.320.00-0.34-0.09-0.1835GAG0.000.040.000.320.76-0.29-0.120.260.00-0.19-0.170.5536GAT0.00-0.350.00-0.16-0.11-0.470.120.100.000.580.08-0.0237GCA0.00-0.220.00-0.02-0.13-0.06-0.16-0.230.000.28-0.390.4838GCC0.000.710.000.050.640.550.490.120.00-0.53-0.07-0.55<table>tableseeoriginaldocumentpage151</column></row><table>23CCG-0.250.160.77-0.2124CCT-0.630.72-0.43-0.2225CGA0.33-0.65-0.72-0.0726CGC0.280.130.050.3527CGG1.000.37-0.06-0.4828CGT-0.26-0.42-0.37-0.4029CTA-0.501.00-0.24-0.7530CTC0.480.73-0.020.5931CTG0.780.070.32-0.2932CTT-0.270.06-0.480.6833GAA0.180.01-0.320.0334GAC0.430.330.170.3735GAG
0.41-0.260.740.1736GAT-0.35-0.41-0.520.0137GCA-0.46-0.33-0.10-0.2638GCC0.210.660.170.2439GCG0.290.300.780.1740GCT-0.32-0.32-0.59-0.3241GGA-0.090.30-0.39-0.1842GGC0.440.390.490.3143GGG0.530.680.18-0.0544GGT-0.14-0.46-0.51-0.5345GTA-0.10-0.230.03-0.4046GTC0.480.690.800.5347GTG0.040.360.43-0.2448GTT-0.29-0.37-0.22-0.1149TAA0.000.000.000.0050TAC-0.390.310.280.2751TAG0.000.000.000.0052TAT-0.36-0.09-0.48-0.2753TCA-0.17-0.08-0.30-0.2854TCC0.390.820.190.4555TCG0.540.591.00-0.3656TCT-0.66-0.420.02-0.0157TGA0.000.000.000.0058TGC-0.31-0.051.000.8059TGG0.760.371.00-0.2260TGT-0.05-0.120.31-0.5061TTA-0.42-0.04-0.23-0.4062TTC-0.08-0.010.43-0.1063TTG0.68-0.03-0.06-0.4564TTT-0.59-0.15-0.230.22TTATTCTTGTTT61626364152<table>tableseeoriginaldocumentpage153</column></row><table>0.000.000.000.000.000.000.000.000.000.000.000.0050TAC-0.13-0.18-0.23-0.10-0.14-0.04-0.09-0.10-0.020.11-0.150.0451TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.120.150.220.100.060.13-0.140.170.210.300.070.2653TCA0.030.040.07-0.050.160.300.060.12-0.270.03-0.270.1054TCC-0.28-0.17-0.31-0.13-0.16-0.22-0.12-0.34-0.05-0.14-0.02-0.1455TCG-0.150.03-0.14-0.040.120.330.020.30-0.150.07-0.210.2156TCT0.210.280.270.190.050.030.11-0.070.350.330.240.3057TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC-0.12-0.03-0.14-0.10-0.04-0.110.04-0.020.09-0.15-0.170.0059TGG-0.040.020.06-0.01-0.03-0.02-0.110.09-0.170.15-0.150.1860TGT0.000.000.210.080.13-0.06-0.050.020.300.270.160.3661TTA0.140.050.060.10-0.12-0.07-0.10-0.06-0.08-0.03-0.09-0.0762TTC-0.35-0.32-0.34-0.280.03-0.28-0.07-0.19-0.18-0.17-0.17-0.1663TTG-0.28-0.25-0.29-0.150.07-0.18-0.06-0.05-0.13-0.10-0.09-0.0764TTT0.300.280.440.290.26-0.060.070.110.140.130.150.18AAAAACAAGAATACAACCACGACTAGAAGCAGGAGT123456789101112表C.9繼續(xù)131415161718192021222324ATAATCATGATTCAACACCAGCATCCACCCCCGCCT1AAA-0.160.00-0.10-0.060.100.180.050.130.000.10-0.08-0.012AAC0.06-0.050.000.01-0.15-0.07-0.12-0.070.010.060.18-0.023AAG0.110.100.150.13-0.12-0.21-0.10-0.15-0.070.030.110.044AAT0.050.010.0
0-0.050.060.110.230.010.030.15-0.15-0.105ACA-0.100.27-0.130.070.09-0.02-0.05-0.13-0.050.040.04-0.176ACC-0.19-0.33-0.15-0.240.040.070.200.210.140.210.350.367ACG-0.160.25-0.050.30-0.12-0.19-0.26-0.230.02-0.28-0.35-0.248ACT0.130.040.260.09-0.070.13-0.110.09-0.110.130.050.099AGA-0.140.03-0.06-0.010.100.050.100.120.040.230.080.0410AGC-0.21-0.020.09-0.020.100.050.210.040.360.140.390.2311AGG-0.210.270.100.14-0.090.14-0.18-0.02-0.18-0.29-0.32-0.1912AGT-0.150.140.240.040.250.210.470.180.420.280.330.3013ATA-0.190.17-0.080.180.20-0.080.060.090.05-0.03-0.390.0914ATC-0.06-0.30-0.22-0.140.070.180.020.040.100.240.290.3015ATG0.010.080.00-0.05-0.070.070.16-0.040.07-0.210.070.0216ATT0.180.010.210.06-0.14-0.08-0.04-0.06-0.070.01-0.19-0.1017CAA-0.130.04-0.09-0.010.190.16-0.060.080.000.31-0.21-0.0118CAC0.03-0.05-0.07-0.06-0.07-0.09-0.090.100.050.190.230.3319CAG-0.050.100.220.16-0.05-0.15-0.40-0.220.06-0.03-0.270.0120CAT0.08-0.090.040.040.02-0.010.11-0.02-0.040.17-0.29-0.1621CCA-0.060.14-0.030.030.180.09-0.110.12-0.180.36-0.170.0022CCC-0.28-0.29-0.25-0.210.340.260.310.020.330.490.550.3823CCG-0.180.130.030.07-0.10-0.11-0.45-0.15-0.08-0.070.050.0024CCT0.170.090.210.13-0.18-0.100.02-0.13-0.180.00-0.09-0.0425CGA-0.320.13-0.240.070.110.25-0.20-0.230.160.120.530.0126CGC-0.210.030.050.18-0.09-0.33-0.13-0.300.02-0.25-0.130.0427CGG0.020.27-0.020.260.10-0.09-0.410.120.09-0.47-0.29-0.1128CGT0.160.010.20-0.08-0.02-0.08-0.01-0.140.240.050.110.1929CTA-0.100.10-0.010.19-0.020.00-0.10-0.03-0.07-0.08-0.25-0.0630CTC-0.26-0.240.03-0.160.420.080.250.090.19-0.060.360.1931CTG-0.220.030.030.080.01-0.11-0.15-0.050.22-0.28-0.060.1932CTT0.300.210.420.130.11-0.050.25-0.13-0.09-0.15-0.18-0.08154<table>tableseeoriginaldocumentpage155</column></row><table>17CAA-0.030.22-0.010.12-0.030.26-0.140.22-0.040.11-0.05-0.0318CAC0.12-0.20-0.15-0.320.41-0.060.180.020.06-0.040.080.0119CAG-0.13-0.26-0.35-0.19-0.13-0.11-0.19-0.040.14-0.100.020.0020CAT-0.48-0.19-0.04-0.200.08-0.010.160.01-0.050.000.000.0121CCA0.160.060.010.090.100.
200.120.18-0.020.16-0.090.0422CCC0.24-0.010.320.090.19-0.110.290.170.270.190.300.2023CCG-0.14-0.22-0.340.18-0.45-0.03-0.41-0.170.01-0.08-0.13-0.0424CCT-0.38-0.210.06-0.290.200.090.010.01-0.06-0.09-0.08-0.1925CGA0.300.080.040.36-0.380.070.03-0.070.07-0.16-0.090.0026CGC-0.03-0.47-0.49-0.380.15-0.23-0.10-0.010.330.21-0.160.0127CGG0.16-0.22-0.540.29-0.31-0.42-0.43-0.360.130.14-0.010.2228CGT0.01-0.320.05-0.480.04-0.110.220.01-0.05-0.110.07-0.0929CTA-0.290.10-0.10-0.10-0.20-0.13-0.270.06-0.030.10-0.020.0830CTC-0.25-0.220.180.210.19-O.U0.150.050.290.090.08-0.0831CTG-0.42-0.19-0.390.16-0.16-0.11-0.22-0.01-0.08-0.18-0.14-0.0432CTT-0.490.05-0.29-0.04-0.11-0.36-0.03-0.270.150.000.28-0.0333GAA0.010.21-0.140.08-0.130.11-0.060.110.000.020.01-0.0234GAC0.05-0.30-0.02-0.420.180.010.06-0.030.110.110.170.1335GAG0.20-0.16-0.030.07-0.20-0.24-0.30-0.130.04-0.03-0.110.0336GAT0.070.010.35-0.110.230.010.220.02-0.06-0.05-0.07-0.0737GCA0.18-0.070.270.030.010.020.050.19-0.15-0.10-0.09-0.1338GCC0.570.160.43-0.160.13-0.050.21-0.090.330.330.430.3039GCG0.41-0.19-0.080.18-0.37-0.30-0.36-0.24-0.01-0.15-0.11-0.0540GCT0.240.02-0.10-0.130.140.120.31-0.09-0.120.000.05-0.0741GGA0.11-0.17-0.050.16-0.030.01-0.100.020.050.15-0.030.1342GGC0.29-0.330.03-0.380.05-0.20-0.02-0.050.080.040.110.0843GGG-0.16-0.10-0.10-0.17-0.39-0.43-0.42-0.390.180.230.030.3344GGT0.510.260.63-0.080.040.070.410.08-0.13-0.210.05-0.1145GTA-0.51-0.18-0.29-0.270.01-0.12-0.14-0.090.020.02-0.17-0.0746GTC0.49-0.070.37-0.070.26-0.040.17-0.200.230.040.320.2347GTG0.13-0.09-0.360.16-0.10-0.13-0.16-0.11-0.26-0.14-0.32-0.1048GTT0.23-0.110.20-0.310.19-0.170.18-0.250.060.020.150.0049TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.23-0.19-0.05-0.270.060.00-0.050.180.060.020.070.0851TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT-0.04-0.02-0.03-0.180.14-0.020.060.01-0.05-0.10-0.04-0.0153TCA-0.370.02-0.010.27-0.110.14-0.11-0.040.020.150.050.0554TCC-0.110.020.32-0.050.140.180.170.130.330.360.210.3055TCG-0.02-0.220.060.23-0.32-0.10-0.
34-0.110.190.270.010.1556TCT-0.17-0.300.20-0.180.060.040.130.01-0.010.090.07-0.0557TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC-0.26-0.19-0.46-0.410.16-0.270.01-0.110.230.140.150.1759TGG0.170.510.360.52-0.09-0.09-0.180.050.04-0.10-0.080.0660TGT-0.17-0.150.05-0.230.08-0.190.03-0.03-0.07-0.09-0.20-0.0961TTA-0.22-0.19-0.150.01-0.19-0.08-0.19-0.05-0.020.03-0.020.0462TTC-0.08-0.22-0.01-0.250.09-0.140.150.060.270.230.280.2363TTG0.120.080.130.110.140.270.150.27-0.06-0.03-0.04-0.0264TTT-0.050.13-0.200.380.21-0.030.190.04-0.19-0.14-0.07-0.13CGACGCCGGCGTCTACTCCTGCTTGAAGACGAGGAT252627282930313233343536表C.9繼續(xù)373839404142434445464748GCAGCCGCGGCTGGAGGCGGGGGTGTAGTCGTGGTT<table>tableseeoriginaldocumentpage157</column></row><table><table>tableseeoriginaldocumentpage158</column></row><table><table>tableseeoriginaldocumentpage159</column></row><table><table>tableseeoriginaldocumentpage160</column></row><table><table>tableseeoriginaldocumentpage161</column></row><table>49TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC-0.09-0.58-0.610.080.04-0.560.42-0.57-0.570.300.52-0.0951TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.600.060.430.610.670.420.660.300.590.520.640.6453TCA0.690.220.070.480.370.200.290.52-0.290.190.030.3954TCC0.04-0.68-0.74-0.25-0.17-0.66-0.13-0.59-0.60-0.090.10-0.0255TCG-0.010.17-0.050.380.680.800.590.12-0.230.650.220.2156TCT0.21-0.45-0.360.380.29-0.540.11-0.440.060.670.500.4157TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.260.620.120.420.110.020.500.700.340.290.760.5159TGG0.15-0.17-0.170.140.59-0.250.16-0.28-0.330.350.170.3360TGT0.09-0.43-0.300.000.45-0.520.23-0.30-0.260.130.400.7061TTA0.57-0.210.060.610.25-0.380.58-0.14-0.190.160.180.3562TTC-0.20-0.61-0.64-0.130.26-0.600.42-0.51-0.58-0.100.44-0.0763TTG-0.27-0.64-0.68-0.280.17-0.650.43-0.59-0.62-0.040.21-0.0664TTT0.680.330.470.610.540.070.470.340.320.390.430.53AAAAACAAGAATACAACCACGACTAGAAGCAGGAGT123456789101112表C.10繼續(xù)131415161718192021222324ATAATCATGATTCAACACCAGCATCCACC
CCCGCCT1AAA0.520.140.160.150.220.250.340.540.040.540.590.322AAC0.53-0.61-0.24-0.41-0.45-0.530.05-0.33-0.500.400.18-0.193AAG0.62-0.56-0.18-0.31-0.41-0.550.35-0.21-0.510.160.57-0.164AAT0.710.030.220.390.260.440.630.520.200.090.440.365ACA0.680.510.290.410.600.330.410.220.010.500.940.526ACC0.44-0.61-0.54-0.59-0.26-0.490.330.08-0.400.130.450.307ACG0.540.120.180.630.140.430.340.420.090.000.440.558ACT0.38-0.300.41-0.12-0.48-0.440.430.08-0.540.460.60-0.129AGA0.47-0.48-0.21-0.32-0.26-0.410.210.06-0.540.440.44-0.1110AGC0.360.020.590.540.090.130.390.510.540.561.000.1011AGG0.500.460.280.270.510.250.390.390.190.580.470.5012AGT0.630.520.540.190.520.630.790.420.610.900.620.4813ATA0.660.510.370.680.790.780.680.490.530.590.770.6514ATC0.41-0.64-0.47-0.46-0.44-0.390.380.09-0.390.350.43-0.0215ATG0.71-0.250.00-0.19-0.05-0.190.110.13-0.330.270.590.2616ATT0.62-0.240.28-0.13-0.30-0.510.210.17-0.480.300.39-0.0217CAA0.54-0.50-0.22-0.16-0.19-0.170.26-0.08-0.370.360.42-0.0518CAC0.69-0.52-0.45-0.46-0.27-0.61-0.170.28-0.210.780.320.1219CAG0.850.270.600.290.220.370.070.230.450.600.37-0.0220CAT0.480.090.460.230.070.130.410.26-0.270.580.62-0.1921CCA0.53-0.52-0.38-0.42-0.12-0.210.020.36-0.470.530.74-0.2222CCC0.58-0.360.360.190.350.120.090.670.060.651.000.2923CCG0.690.420.430.620.590.260.490.790.40-0.090.720.7824CCT0.780.130.450.13-0.33-0.540.28-0.28-0.220.210.100.1425CGA0.330.760.700.790.900.510.781.000.681.001.001.0026CGC0.47-0.010.100.75-0.10-0.43-0.380.350.430.511.000.0127CGG1.000.791.000.530.821.000.621.001.000.301.001.0028CGT0.43-0.35-0.18-0.32-0.56-0.600.540.03-0.38-0.030.470.0529CTA0.610.580.450.310.12-0.230.04-0.10-0.28-0.120.900.1430CTC0.580.460.470.330.740.590.660.890.650.820.770.7231CTG0.60-0.100.350.120.16-0.170.27-0.040.25-0.200.880.4632CTT0.850.630.530.540.300.260.690.590.460.270.590.34<table>tableseeoriginaldocumentpage163</column></row><table>17CAA0.910.490.85-0.160.180.510.110.56-0.21-0.100.10-0.0918CAC1.001,000.51-0.540.570.660.460.24-0.27-0.42-0.030.0119CAG0.44-0.110.04-0.050.240.
470.370.710.26-0.070.480.3920CAT0.68-0.420.72-0.320.320.530.190.320.13-0.170.200.3221CCA0.730.360.54-0.13-0.080.520.310.31-0.43-0.240.07-0.1022CCC1.00-0.390.420.650.220.590.560.720.670.290.680.5223CCG0.571.000.261.000.56-0.04-0.100.180.410.260.430.0424GCT0.82.0.040.70-0.41.0.100.440.810.030.040.45-0.0725CGA1.001.001.001.000.830.191.000.640.730.880.740.8026CGC1.000.23-0.09-0.49-0.010.520.491.000.670.710.850.2227CGG-0.461.001.001.000.42-0.27-0.090.690.810.581.000.8928CGT1.001.001.00-0.75-0.110.80-0.390.08-0.48-0.400.15-0.2529CTA0.841.000.72-0.53-0.050.76-0.070.360.130.030.290.1130CTC0.620.081.000.800.761.000.850.750.690.040.850.7231CTG0.380.510.300.270.05-0.17-0.270.390.13-0.260.300.1732CTT0.830.79-0.320.390.090.240.120.490.420.100.700.5033GAA0.760.710.42-0.31-0.180.610.160.50-0.20-0.180.18-0.0534GAC0.57-0.020.51-0.700.420.630.270.55-0.12-0.210.110.1535GAG0.77-0.080.610.02-0.070.150.210.300.300.160.270.3236GAT0.920.630.87-0.390.270.530.430.40-0.09-0.210.330.1437GCA0.790.510.120.000.640.820.670.480.440.180.410.5038GCC1.000.500.76-0.380.170.340.56-0.030.14-0.120.470.1839GCG1.000.361.000.180.33-0.100.230.430.570.140.100.4540GCT0.830.500.86-0.490.020.460.180.20-0.48-0.460.08-0.2041GGA0.690.821.000.450.570.630.550.710.580.630.600.5742GGC0.82-0.211.00-0.500.460.360.370.420.360.450.410.2743GGG0.430.660.510.28-0.08-0.030.080.230.780.740.580.7844GGT0.840.810.87-0.57-0.270.490.300.14-0.48-0.54-0.05-0.2745GTA0.820.580.700.020.610.540.560.340.290.360.190.4646GTC0.810.091.00-0.590.330.600.15-0.03-0.22-0.440.360.0447GTG0.800.061.000.300.260.590.170.16-0.02-0.090.040.2848GTT1.000.050.83-0.560.250.270.500.04-0.24-0.410.410.0949TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.290.160.70-0.660.250.250.050.37-0.21-0.33-0.03-0.0851TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.870.840.54-0.020.400.500.240.550.10-0.110.260.3453TCA1.000.640.230.240.230.500.330.590.320.520.480.3654TCC0.800.031.00-0.600.190.430.520.29-0.20-0.29-0.06-0.0155TCG1.000.221.000.340.080.64-0.310
.430.540.070.310.2856TCT0.63-0.021.00-0.63-0.290.320.310.29-0.45-0.480.34-0.0957TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC1.00-0.411.000.280.580.340.47-0.040.050.060.700.9459TGG1.000.420.580.510.230.54-0.180.79-0.15-0.290.410.2160TGT1.00-0.361.00-0.650.14-0.02-0.290.36-0.20-0.190.09-0.3261TTA0.750.700.860.070.020.820.280.260.150.250.380.2862TTC0.670.031.00-0.570.08-0.010.280.21-0.10-0.220.200.0363TTG0.830.700.72-0.39-0.290.640.190.59-0.52-0.49-0.08-0.2864TTT0.890.200.810.600.590.830.600.72-0.10-0.170.290.19CGACGCCGGCGTCTACTCCTGCTTGAAGACGAGGAT252627282930313233343536表C.10繼續(xù)373839404142434445464748GCAGCCGCGGCTGGAGGCGGGGGTGTAGTCGTGGTT<table>tableseeoriginaldocumentpage165</column></row><table>56TCT0.15-0.610.21-0.590.590.350.43-0.610.34-0.56-0.29-0.4157TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.750.200.560.590.900.33-0.070.221.000.871.000.3959TGG0.46-0.200.44-0.260.620.48-0.21-0.310.58-0.060.53-0.3660TGT0.63-0.47-0.48-0.380.76-0.030.23-0.480.19-0.450.01-0.4661TTA0.490.160.36-0.050.480.290.54-0.250.52-0.220.040.0362TTC0.470.030.500.200.700.180.63-0.510.73-0.440.17-0.0163TTG0.18-0.580.08-0.620.480.050.50-0.590.32-0.58-0.16-0.5064TTT0.20-0.400.25-0.260.730.240.47-0.230.53-0.290.34-0.23GCAGCCGCGGCTGGAGGCGGGGGTGTAGTCGTGGTT373839404142434445464748表C.10繼續(xù)495051525354555657585960TAATACTAGTATTCATCCTCGTCTTGATGCTGGTGT1AAA0.00-0.140.000.310.100.120.26-0.180.000.34-0.030.012AAC0.00-0.530.00-0.15-0.03-0.48-0.17-0.400.000.25-0.43-0.573AAG0.00-0.430.000.320.35-0.370.07-0.420.000.240.05-0.314AAT0.000.340.000.440.460.080.340.110.000.600.510.415ACA0.000.310.000.630.410.160.460.190.000.660.27-0.166ACC0.00-0.540.000.240.25-0.45-0.15-0.440.00-0.33-0.39-0.427ACG0.00-0.020.000.660.560.700.520.460.000.750.570.538ACT0.00-0.500.000.11-0.11-0.49-0.11-0.540.000.90-0.08-0.349AGA0.00-0.310.000.430.05-0.440.37-0.410.000.46-0.17-0.3410AGC0.000.230.000.600.390.010.560.170.000.200.100.5111AGG0.000.230.000.540.000.120.490.190.001.000.610.7112AGT0.000.140.000.630.410.5
10.570.470.000.450.440.4913ATA0.000.250.000.560.430.370.440.500.000.710.760.5314ATC0.00-0.490.00-0.06-0.06-0.570.03-0.530.000.37-0.44-0.4615ATG0.00-0.240.000.230.31-0.330.28-0.270.000.290.00-0.1516ATT0.00-0.210.000.180.34-0.450.14-0.390.00-0.06-0.02-0.1717CAA0.00-0.270.000.200.36-0.270.59-0.330.000.44-0.20-0.3018CAC0.00-0.310.00-0.090.52-0.60-0.14-0.600.000.45-0.31-0.3219CAG0.00-0.210.000.350.130.160.330.080.000.190.550.2520CAT0.00-0.060.000.290.300.050.42-0.160.000.080.250.0621CCA0.00-0.450.000.260.15-0.390.24-0.580.000.16-0.48-0.4522CCC0.000.470.000.140.360.040.30-0.090.000.280.900.1223CCG0.000.520.000.560.760.780.820.800.000.540.880.4424CCT0.00-0.360.000.240.34-0.39-0.18-0.090.00-0.090.400.5525CGA0.000.840.000.631.001.000.630.720.001.000.811.0026CGC0.000.030.000.411.00-0.28-0.250.160.000.170.541.0027CGG0.000.440.000.58-0.150.601.000.760.001.000.670.2628CGT0.00-0.680.00-0.470.30-0.53-0.25-0.500.00-0.44-0.51-0.6929CTA0.00-0.180.000.080.320.030.04-0.050.000.31-0.38-0.3030CTC0.000.010.000.550.60-0.350.820.210.000.670.590.0031CTG0.000.100.000.520.480.410.340.260.000.310.42-0.2232CTT0.00-0.250.000.150.17-0.39-0.10-0.250.000.560.39-0.0733GAA0.00-0.260.000.230.55-0.440.53-0.380.000.30-0.12-0.2634GAC0.00-0.480.000.070.31-0.570.00-0.580.00-0.01-0.25-0.5635GAG0.00-0.170.000.220.58-0.060.030.130.000.370.320.1536GAT0.000.090.000.280.390.02-0.110.000.000.700.180.2437GCA0.000.240.000.590.32-0.120.210.070.000.610.350.3338GCC0.00-0.240.000.450.06-0.56-0.10-0.400.000.680.08-0.2639GCG0.000.000.000.340.600.390.570.510.000.39-0.020.75<table>tableseeoriginaldocumentpage167</column></row><table><table>tableseeoriginaldocumentpage168</column></row><table><table>tableseeoriginaldocumentpage169</column></row><table>49TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.47-0.08-0.070.830.200.00-0.160.920.75-0.200.790.7051TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT1.000.910.901.000.390.820.711.000.440.740.670.0353TCA0.120.890.700.38-0.570.730.29-0.500.350.13-0.14-0.5454TCC
0.30-0.17-0.020.280.01-0.07-0.12-0.280.05-0.04-0.04-0.0355TCG0.030.240.000.21-0.240.32-0.300.270.210.490.380.2056TCT1.000.911.00-0.05-0.320.620.461.00-0.550.580.781.0057TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC-0.10-0.07-0.07-0.26-0.210.02-0.120.400.040.290.42-0.0959TGG-0.36-0.010.030.34-0.200.15-0.23-0.200.47-0.020.510.0060TGT-0.460.970.81-0.561.000.51-0.10-0.500.120.920.310.6461TTA1.001.001.001.00-0.760.840.671.00-0.900.461.001.0062TTC-0.20-0.020.000.190.58-0.130.220.280.21-0.010.330.2063TTG-0.700.24-0.23-0.600.040.35-0.20-0.32-0.71-0.22-0.49-0.5464TTT-0.050.950.731.00-0.070.720.62-0.55-0.410.610.001.00AAAAACAAGAATACAACCACGACTAGAAGCAGGAGT123456789101112表C.ll繼續(xù)131415161718192021222324ATAATCATGATTCAACACCAGCATCCACCCCCGCCT1AAA-0.500.740.22-0.69-0.42-0.21-0.24-0.69-0.67-0.47-0.58-0.572AAC0.58-0.05-0.030.210.37-0.020.000.470.550.23-0.030.223AAG0.46-0.05-0.010.230.50-0.01-0.010.370.360.060.040.334AAT1.000.750.83-0.39-0.74-0.10-0.18-0.74-0.82-0.62-0.72-0.705ACA1.000.850.68-0.07-0.67-0.24-0.44-0.76-0.89-0.73-0.75-0.806ACC0.590.110.16-0.100.150.090.04-0.210.400.380.180.157ACG0.18-0.26-0.30-0.35-0.15-0.050.02-0.270.05-0.15-0.10-0.298ACT0.310.820.86-0.32-0.700.13-0.37-0.78-0.87-0.75-0.75-0.909AGA-0.300.390.37-0.67-0.74-0.13-0.13-0.17-0.86-0.19-0.28-0.7910AGC1.000.130.30-0.170.12-0.12-0.07-0.480.040.080.12-0.3711AGG0.720.000.06-0.45-0.640.20-0.02-0.30-0.62-0.23-0.10-0.1812AGT0.510.700.85-0.52-0.33-0.22-0.20-0.69-0.65-0.30-0.37-0.7913ATA1.000.870.91-0.570.430.120.06-0.250.44-0.67-0.65-0.6614ATC0.82-0.05-0.04-0.120.440.00-0.010.180.500.180.080.4315ATG0.97-0.030.000.170.15-0.01-0.010.170.090.07-0.060.0716ATT1.000.730.80-0.19-0.83-0.44-0.24-0.58-0.91-0.79-0.81-0.9117CAA0.630.800.77-0.17-0.320.220.020.45-0.19-0.25-0.19-0.7218CAC-0.25-0.06-0.050.370.200.010.000.040.280.030.200.1419CAG-0.18-0.04-0.040.070.01-0.020.000.12-0.24-0.050.09-0.0220CAT0.190.790.650.57-0.68-0.08-0.06-0.37-0.59-0.60-0.65-0.6721CCA-0.580.960.69-0.83-0.700.03-0.37-0.42-0.93-0.67-0.73-0.8722CCC
-0.180.300.25-0.280.320.560.440.210.290.820.480.0723CCG-0.29-0.21-0.20-0.18-0.18-0.30-0.230.07-0.35-0.28-0.18-0.1624CCT0.300.840.75-0.78-0.70-0.08-0.23-0.65-0.84-0.60-0.55-0.8425CGA-0.600.630.64-0.59-0.600.34-0.18-0.58-0.78-0.06-0.34-0.6526CGC-0.090.030.080.050.130.310.24-0.050.280.050.340.3927CGG-0.37-0.12-0.21-0.26-0.27-0.29-0.18-0.22-0.01-0.150.08-0.0628CGT0.350.470.610.11-0.610.21-0.04-0.56-0.67-0.49-0.46-0.7129CTA-0.440.750.83-0.48-0.800.14-0.34-0.67-0.88-0.54-0.47-0.8230CTC0.070.230.30-0.01-0.230.54-0.380.520.560.410.570.5431CTG0.16-0.14-0.170.28-0.48-0.270.560.320.25-0.32-0.070.1932CTT0.600.890.85-0.21-0.690.08-0.72-0.39-0.86-0.75-0.64-0.8233GAA0.500.800.710.40-0.35-0.15-0.150.20-0.64-0.38-0.48-0.4334GAC-0.06-0.05-0.040.360.340.040.010.010.460.240.000.3735GAG-0.17-0.13-0.110.200.220.010.030.320.260.210.080.4236GAT0.160.860.75-0.44-0.68-0.36-0.27-0.68-0.80-0.66-0.72-0.7437GCA0.260.830.81-0.42-0.69-0.33-0.47-0.58-0.84-0.70-0.72-0.7838GCC-0.040.220.170.220.200.140.11-0.090.440.330.300.2539GCG-0.49-0.32-0.29-0.180.12-0.11-0.030.080.05-0.130.010.0840GCT0.470.860.85-0.12-0.75-0.03-0.42-0.80-0.85-0.77-0.70-0.8641GGA-0.150.540.40-0.49-0.57-0.37-0.32-0.41-0.73-0.33-0.51-0.5742GGC-0.13-0.16-0.120.150.410.260.260.250.360.240.290.3643GGG-0.220.13-0.02-0.28-0.47-0.36-0.41-0.51-0.38-0.16-0.27-0.4244GGT0.720.620.680.16-0.06-0.040.01-0.46-0.38-0.30-0.38-0.5245GTA0.110.880.85-0.40-0.670.21-0.34-0.32-0.47-0.50-0.68-0.5346GTC0.100.050.150.190.430.250.140.110.710.330.370.5447GTG0.35-0.16-0.230.210.28-0.27-0.140.270.04-0.18-0.250.3048GTT0.580.810.70-0.44-0.55-0.17-0.39-0.77-0.86-0.75-0.72-0.8549TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.27-0.05-0.040.400.24-0.050.000.790.330.350.000.6351TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.130.750.80-0.46-0.60-0.22-0.181.00-0.81-0.69-0.80-0.6553TCA-0.290.820.87-0.74-0.530.13-0.40-0.56-0.87-0.65-0.74-0.8354TCC-0.120.090.110.120.340.230.140.310.280.280.180.2055TCG-0.43-0.28-0.38-0.310.19-0.12-0.070.130.04-0.10-0.
050.0956TCT-0.720.800.87-0.220.40-0.25-0.35-0.79-0.37-0.70-0.62-0.9057TCA0.000.000.000.000.000.000.000.000.000.000.000.0058丁GC0.22-0.07-0.04-0.450.020.020.00-0.11-0.060.100.040.1859TGG-0.180.010.00-0.010.27-0.02-0.010.200.040.00-0.010.1960TGT1.000.910.41-0.29-0.420.020.01-0.69-0.67-0.31-0.44-0.4661TTA1.001.001.00-0.91-0.861.00-0.31-0.88-0.91-0.36-0.611.0062TTC0.41-0.02-0.010.200.47-0.01-0.020.350.700.050.030.3363TTG0.21-0.05-0.19-0.68-0.440.05-0.60-0.30-0.68-0.01-0.16-0.4364TTT1.000.720.85-0.84-0.79-0.34-0.21-0.84-0.89-0.83-0.71-0.94ATAATCATGATTCAACACCAGCATCCAcccCCGCCT131415161718192021222324表C.ll繼續(xù)252627282930313233343536CGACGCCGGCGTCTACTCCTGCTTGAAGACGAGGAT1AAA-0.82-0.59-0.56-0.28-0.85-0.65-0.45-0.580.200.670.500.102AAC0.02-0.01-0.050.310.60-0.170.150.080.14-0.03-0.040.273AAG0.190.12-0.050.340.37-0.070.130.29-0.10-0.05-0.010.384AAT-0.680.030.24-0.43-0.74-0.56-0.38-0.80-0.440.480.46-0.655ACA-0.80-0.08-0.34-0.69-0.63-0.35-0.36-0.68-0.090.570.55-0.076ACC0.07-0.09-0.12-0.210.800.000.250.02-0.07-0.05-0.12-0.297ACG0.080.430.350.020.44-0.16-0.23-0.270.220.100.20-0.168ACT-0.58-0.18-0.05-0.40-0.59-0.44-0.08-0.86-0.160.500.49-0.499AGA-0.76-0.08-0.12-0.22-0.37-0.39-0.17-0.78-0.400.560.47-0.3510AGC-0.41-0.16-0.11-0.420.890.120.23-0.36-0.20-0.13-0.25-0.4911AGG-0.440.280.10-0.11-0.21-0.130.01-0.42-0.480.190.23-0.4112AGT-0.420.470.53-0.51-0.450.110.02-0.62-0.50-0.03-0.12-0.7113ATA-0.75-0.53-0.60-0.72-0.59-0.39-0.15-0.370.210.520.280.2814ATC0.47-0.060.190.080.66-0.220.18-0.180.22-0.03-0.060.1815ATG0.10-0.080.020.160.75-0.130.050.280.15-0.01-0.030.2316ATT-0.77-0.300.12-0.661.00-0.50-0.38-0.910.360.560.51-0.73m<table>tableseeoriginaldocumentpage172</column></row><table><table>tableseeoriginaldocumentpage173</column></row><table>56TCT-0.370.120.02-0.87-0.480.660.20-0.32-0.050.240.56-0.7157TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.190.12-0.100.06-0.130.05-0.23-0.19-0.060.14-0.04-0.0159TGG-0.030.14-0.190.300.05-0.090.200.210.260.16-0.200.2460TGT-0.10-0.23-0.32-0.500
.050.710.560.37-0.08-0.23-0.49-0.6961TTA1.000.920.24-0.84-0.510.88-0.161.00-0.610.580.031.0062TTC0.39-0.090.090.410.31-0.070.080.050.25-0.090.090.1663TTG-0.340.430.21-0.360.130.570.280.150.480.480.18-0.1864TTT0.280.240.48-0.76-0.070.630.680.36-0.320.470.49-0.71GCAGCCGCGGCTGGAGGCGGGGGTGTAGTCGTGGTT373839404142434445464748表C.11繼續(xù)495051525354555657585960TAATACTAGTATTCATCCTCGTCTTGATGCTGGTGT1AAA0.000.200.00-0.13-0.480.260.57-0.480.000.33-0.35-0.402AAC0.00-0.040.000.450.410.020.220.080.00-0.06-0.030.463AAG0.00-0.040.000.620.270.01-0.100.260.00-0.050.030.394AAT0.000.460.001.00-0.43-0.07-0.07-0.750.000.580.75-0.535ACA0.000.570.00-0.37-0.640.13-0.02-0.790.000.150.17-0.596ACC0.000.120.00-0.28-0.200.13-0.12-0.170.00-0.17-0.22-0.397ACG0.00-0.200.00-0.50-0.160.180.05-0.340.000.550.56-0.078ACT0.000.680.00-0.08-0.76-0.22-0.45-0.900.000.390.161.009AGA0.000.740.00-0.42-0.860.16-0.31-0.800.000.040.110.1210AGC0.000.220.00-0.540.350.430.42-0.170.000.05-0.07-0.2711AGG0.000.390.00-0.37-0.620.15-0.29-0.580.000.240.48-0.1812AGT0.000.120.00-0.51-0.59-0.27-0.56-0.780.00-0.270.08-0.5113ATA0.000.510.000.180.38-0.38-0.43-0.760.000.370.220.2114ATC0.00-0.030.000.180.490.010.240.290.00-0.05-0.010.3115ATG0.00-0.020.000.390.340.11-0.01-0.030.00-0.030.000.3616ATT0.000.300.00-0.43-0.75-0.240.02-0.700.000.830.45-0.4117CAA0.000.800.00-0.13-0.390.470.491.000.000.580.111.0018CAC0.00-0.050.00-0.050.07-0.020.210.420.00-0.08-0.060.2719CAG0.00-0.070.000.630.09-0.02-0.060.390.00-0.07-0.010.4020CAT0.000.640.000.260.15-0.060.24-0.480.000.660.790.6921CCA0.000.720.000.41-0.750.430.34-0.810.000.110.15-0.7222CCC0.000.120.00-0.15-0.170.00-0.11-0.510.00-0.29-0.25-0.4223CCG0.00-0.110.00-0.09-0.08-0.01-0.010.170.000.330.230.4024CCT0.000.580.00-0.61-0.83-0.53-0.50-0.860.000.130.45-0.7625CGA0.000.410.00-0.46-0.720.240.14-0.720.00-0.17-0.05-0.6526CGC0.00-0.190.00-0.43-0.18-0.110.060.050.00-0.21-0.31-0.3827CGG0.000.190.00-0.020.020.320.170.320.000.330.410.1128CGT0.000.340.00-0.11-0.66-0.42-0.42-0.640.000.380.58-0.4229CTA
0.000.590.00-0.76-0.720.710.49-0.500.000.68-0.10-0.6830CTC0.00-0.160.000.040.52-0.230.420.310.000.400.430.3731CTG0.000.080.000.370.43-0.110.050.520.00-0.23-0.200.2732CTT0.000.580.00-0.53-0.62-0.37-0.18-0.830.000.720.750.3033GAA0.000.650.000.45-0.350.440.420.010.000.210.160.7534GAC0.00-0.050.000.390.330.020.090.510.00-0.06-0.040.2735GAG0.00-0.130.000.560.15-0.02-0.170.570.00-0.09-0.030.4536GAT0.000.600.000.37-0.78-0.08-0.17-0.680.000.640.880.5937GCA0.000.560.000.28-0.530.13-0.10-0.680.000.320.05-0.3238GCC0.000.000.00-0.16-0.04-0.12-0.11-0.090.00-0.18-0.19-0.4139GCG0.00-0.070.00-0.100.050.060.030.050.000.380.330.2440GCT0.000.590.000.00-0.75-0.53-0.52-0.870.000.440.52-0.5241GGA0.00-0.010.00-0.29-0.470.32-0.090.200.000.220.23-0.4642GGC0.000.010.00-0.320.130.120.040.240.00-0.11-0.21-0.3243GGG0.00-0.070.00-0.06-0.070.330.000.150.000.250.38-0.1244GGT0.000.230.000.26-0.56-0.26-0.49-0.700.000.570.800.0145GTA0.000.880.000.470.380.550.46-0.460.000.750.35-0.1346GTC0.00-0.100.00-0.070.40-0.220.160.170.00-0.040.140.2147GTG0.000.050.000.210.270.19-0.030.470.00-0.07-0.200.2448GTT0.000.640.00-0.24-0.83-0.22-0.33-0.490.000.680.810.2849TAA0.000.000.000.000.000.000.000.000.000.000.000.0050TAC0.00-0.080.000.940.260.10-0.110.800.00-0.09-0.030.6251TAG0.000.000.000.000.000.000.000.000.000.000.000.0052TAT0.000.730.000.61-0.460.570.041.000.000.620.640.4553TCA0.000.780.001.00-0.630.38-0.12-0.850.000.180.070.4154TCC0.00-0.010.000.280.10-0.07-0.24-0.020.00-0.25-0.22-0.0555TCG0.00-0.210.000.220.010.10-0.070.070.000.450.450.4256TCT0.000.760.00-0.46-0.62-0.52-0.55-0.900.000.410.481.0057TGA0.000.000.000.000.000.000.000.000.000.000.000.0058TGC0.00-0.020.00-0.250.030.06-0.20-0.030.00-0.03-0.07-0.0859TGG0.00-0.010.000.15-0.140.15-0.170.390.00-0.030.000.2760TGT0.000.430.00-0.26-0.72-0.31-0.45-0.840.000.430.76-0.4261TTA0.001.000.00-0.94-0.850.670.52-0.910.00-0.45-0.691.0062TTC0.00-0.020.000.380.40-0.120.140.400.00-0.03-0.010.2963TTG0.000.080.00-0.55-0.73-0.10-0.27-0.570.000.01-0.44-0.5264TTT0.000.470.00
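-0.48 -0.48 -0.02 [illegible] -0.69 0.00 0.57 0.62 -0.69
      TAA TAC TAG TAT TCA TCC TCG TCT TGA TGC TGG TGT
      49  50  51  52  53  54  55  56  57  58  59  60

Table C.11 (continued)

No. Codon   61 TTA   62 TTC   63 TTG   64 TTT
 1  AAA       1.00     0.43    -0.28    -0.73
 2  AAC       0.08    -0.02     0.78     0.57
 3  AAG       1.00    -0.02     0.27    -0.02
 4  AAT       1.00     0.35     0.44    -0.65
 5  ACA       1.00     0.65    -0.14    -0.51
 6  ACC       0.62    -0.12     0.28     0.01
 7  ACG      -0.37     0.22    -0.25    -0.01
 8  ACT      -0.93     0.25    -0.57    -0.82
 9  AGA       1.00     0.48    -0.14    -0.76
10  AGC       1.00     0.18     0.51    -0.41
11  AGG      -0.44     0.13    -0.48     0.20
12  AGT       1.00     0.15     0.60    -0.80
13  ATA       1.00     0.41    -0.05    -0.43
14  ATC       1.00    -0.01     0.76     0.18
15  ATG       1.00    -0.01     0.63     0.38
16  ATT       1.00     0.25    -0.70    -0.91
17  CAA      -0.90     0.77    -0.69    -0.85
18  CAC      -0.19    -0.04     0.49     0.16
19  CAG      -0.46    -0.04    -0.74     0.39
20  CAT       1.00     0.50    -0.31     1.00
21  CCA      -0.84     0.71    -0.64    -0.59
22  CCC      -0.37    -0.15    -0.07    -0.15
23  CCG       0.04     0.09    -0.58    -0.07
24  CCT       1.00     0.34    -0.41    -0.65
25  CGA      -0.88     0.41    -0.39    -0.69
26  CGC       0.51    -0.09     0.64    -0.30
27  CGG      -0.43     0.09    -0.51     0.21
28  CGT      -0.72    -0.05     0.03    -0.56
29  CTA      -0.95     0.56    -0.70    -0.67
30  CTC       0.32    -0.09     0.71     0.04
31  CTG       0.29     0.04     0.02     0.54
32  CTT      -0.94     0.36    -0.56    -0.79
33  GAA      -0.29     0.65     0.19     0.00
34  GAC       1.00    -0.03     0.62     0.38
35  GAG      -0.12    -0.11    -0.52     0.26
36  GAT      -0.57     0.40    -0.39     0.37
37  GCA      -0.59     0.68    -0.52    -0.55
38  GCC       0.26    -0.10     0.47    -0.12
39  GCG      -0.52     0.09    -0.43    -0.20
40  GCT      -0.85     0.36    -0.53    -0.70
41  GGA      -0.50     0.19     0.08    -0.33
42  GGC       0.65    -0.04     0.63    -0.12
43  GGG      -0.48     0.05    -0.43    -0.19
44  GGT      -0.35     0.05     0.31    -0.58
45  GTA       1.00     0.82    -0.21     0.65
46  GTC       0.61    -0.18     0.84     0.57
47  GTG       0.64     0.20     0.32     0.41
48  GTT       1.00     0.26    -0.53    -0.24
49  TAA       0.00     0.00     0.00     0.00
50  TAC       0.65    -0.02     0.60     0.37
51  TAG       0.00     0.00     0.00     0.00
52  TAT      -0.86     0.37    -0.31    -0.41
53  TCA      -0.87     0.65    -0.42    -0.81
54  TCC       0.27    -0.24     0.12    -0.10
55  TCG      -0.06     0.25    -0.52     0.25
56  TCT      -0.92     0.33    -0.72    -0.67
57  TGA       0.00     0.00     0.00     0.00
58  TGC      -0.52    -0.02     0.50    -0.21
59  TGG      -0.17     0.00    -0.01    -0.11
60  TGT       1.00     0.31    -0.25    -0.22
61  TTA       1.00     0.72    -0.82     1.00
62  TTC       0.16    -0.02     0.63     0.51
63  TTG       1.00     0.20    -0.74    -0.55
64  TTT       1.00     0.45    -0.86    -0.47

Applicant's or agent's file reference: 25051WO
International application No.:

INDICATIONS RELATING TO DEPOSITED MICROORGANISM (PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description (English text) on page 26, line 24.
B. IDENTIFICATION OF DEPOSIT. Further deposits are identified on an additional sheet (yes)
Name of depositary institution: CENTRAAL BUREAU VOOR SCHIMMELCULTURES
Address of depositary institution (including postal code and country): Uppsalalaan 8, P.O. Box 85167, NL-3508 AD Utrecht, The Netherlands
Date of deposit: 10-08-1988
Accession number: CBS 513.88
C. ADDITIONAL INDICATIONS (leave blank if not applicable). This information is continued on an additional sheet ( )
We hereby inform you that, under Rule 13bis PCT, until publication of the mention of grant of the national patent, samples of the above microorganism shall be made available only to an expert nominated by the requester. If the application is refused, withdrawn or deemed to be withdrawn, this provision remains in effect for twenty years from the filing date.
D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)
E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)
The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g. "Accession Number of Deposit")
For receiving Office use only: ( ) This sheet was received with the international application. Authorized officer:
For International Bureau use only: ( ) This sheet was received by the International Bureau on: Authorized officer (signature):

Applicant's or agent's file reference: 25051WO
International application No.:

INDICATIONS RELATING TO DEPOSITED MICROORGANISM (PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description (English text) on page 26, line 26.
B. IDENTIFICATION OF DEPOSIT. Further deposits are identified on an additional sheet (yes)
Name of depositary institution: CENTRAAL BUREAU VOOR SCHIMMELCULTURES
Address of depositary institution (including postal code and country): Uppsalalaan 8, P.O. Box 85167, NL-3508 AD Utrecht, The Netherlands
Date of deposit: 02-06-1995
Accession number: CBS 455.95
C. ADDITIONAL INDICATIONS (leave blank if not applicable). This information is continued on an additional sheet ( )
We hereby inform you that, under Rule 13bis PCT, until publication of the mention of grant of the national patent, samples of the above microorganism shall be made available only to an expert nominated by the requester. If the application is refused, withdrawn or deemed to be withdrawn, this provision remains in effect for twenty years from the filing date.
D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)
E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)
The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g. "Accession Number of Deposit")
For receiving Office use only: ( ) This sheet was received with the international application. Authorized officer:
For International Bureau use only: ( ) This sheet was received by the International Bureau on: Authorized officer (signature):

Claims

1. A method of optimizing a nucleotide sequence encoding a predetermined amino acid sequence, wherein the coding sequence is optimized for expression in a predetermined host cell, the method comprising:
(a) generating at least one original coding sequence that encodes the predetermined amino acid sequence;
(b) generating at least one newly generated coding sequence from the at least one original coding sequence by replacing one or more codons in the at least one original coding sequence with a synonymous codon;
(c) determining a fitness value of the at least one original coding sequence and a fitness value of the at least one newly generated coding sequence, using a fitness function that determines at least one of single codon fitness and codon pair fitness for the predetermined host cell;
(d) choosing one or more selected coding sequences among the at least one original coding sequence and the at least one newly generated coding sequence in accordance with a predetermined selection criterion, such that the higher the fitness value, the higher the chance of being chosen;
(e) repeating actions (b) to (d), while treating the one or more selected coding sequences as the one or more original coding sequences in actions (b) to (d), until a predetermined iteration stop criterion is fulfilled.

2. A method according to claim 1, wherein the predetermined selection criterion is such that the one or more selected coding sequences have the best fitness value according to a predetermined criterion.

3. A method according to claim 1 or 2, wherein the method comprises, after action (e), selecting a best individual coding sequence among the one or more selected coding sequences, wherein the best individual coding sequence has a better fitness value than the other selected coding sequences.

4. A method according to any of claims 1-3, wherein the predetermined iteration stop criterion is at least one of:
testing whether at least one of the selected coding sequences has a best fitness value above a predetermined threshold;
testing whether none of the selected coding sequences has a best fitness value below the predetermined threshold;
testing whether at least one of the selected coding sequences has at least 30% of the codon pairs that have an associated positive codon pair weight for the predetermined host cell in the original coding sequence transformed into codon pairs with an associated negative weight; and
testing whether at least one of the selected coding sequences has at least 30% of the codon pairs that have an associated positive weight above 0 for the predetermined host cell in the original coding sequence transformed into codon pairs with an associated weight below 0.

5. A method according to any of claims 1-4, wherein the fitness function defines single codon fitness by means of:

$$fit_{sc}(g) = \frac{1}{|g|} \sum_{k=1}^{|g|} \left| r_{sc}^{target}(c(k)) - r_{sc}^{g}(c(k)) \right|$$

where g symbolizes a coding sequence, |g| its length, g(k) its k-th codon, $r_{sc}^{target}(c(k))$ is the desired ratio of codon c(k), and $r_{sc}^{g}(c(k))$ is the actual ratio in the nucleotide coding sequence g.

6. A method according to any of claims 1-4, wherein the fitness function defines codon pair fitness by means of:

$$fit_{cp}(g) = \frac{1}{|g|-1} \sum_{k=1}^{|g|-1} w((c(k), c(k+1)))$$

where w((c(k), c(k+1))) is the weight of a codon pair in a coding sequence g, |g| is the length of the nucleotide coding sequence, and c(k) is the k-th codon in the coding sequence g.

7. A method according to any of claims 1-4, wherein the fitness function is defined by means of:

$$fit_{combi}(g) = \frac{fit_{cp}(g)}{cpi + fit_{sc}(g)}$$

where

$$fit_{sc}(g) = \frac{1}{|g|} \sum_{k=1}^{|g|} \left| r_{sc}^{target}(c(k)) - r_{sc}^{g}(c(k)) \right|$$

cpi is a real value greater than zero, $fit_{cp}(g)$ is the codon pair fitness function, $fit_{sc}(g)$ is the single codon fitness function, w((c(k), c(k+1))) is the weight of a codon pair in a coding sequence g, |g| is the length of the coding sequence, c(k) is the k-th codon in the sequence of codons, $r_{sc}^{target}(c(k))$ is the desired ratio of codon c(k), and $r_{sc}^{g}(c(k))$ is the actual ratio in the coding sequence g.

8. A method according to claim 7, wherein cpi is between $10^{-4}$ and 0.5.

9. A method according to any of claims 6-8, wherein the codon pair weights w are taken from a 61x61 codon pair matrix without stop codons, or from a 61x64 codon pair matrix including stop codons, and wherein the codon pair weights w are calculated by a computer-based method using as input at least one of:
a nucleotide sequence set consisting of at least 200 coding sequences of the predetermined host;
a nucleotide sequence set consisting of at least 200 coding sequences of the species to which the predetermined host belongs;
a nucleotide sequence set consisting of at least 5% of the protein-encoding nucleotide sequences in a genome sequence of the predetermined host; and
a nucleotide sequence set consisting of at least 5% of the protein-encoding nucleotide sequences in a genome sequence of a genus related to the predetermined host.

10. A method according to claim 9, wherein the codon pair weights w are determined for at least 5%, 10%, 20%, 50% and preferably 100% of the possible 61x64 codon pairs, the codon pairs including the termination signal as a codon.

11. A method according to claims 6-8, wherein the codon pair weights w are taken from a 61x61 codon pair matrix without stop codons, or from a 61x64 codon pair matrix including stop codons, and wherein the codon pair weights w are defined by:

$$w((c_i, c_j)) = \frac{n_{exp}^{combined}((c_i, c_j)) - n_{obs}^{high}((c_i, c_j))}{\max\left(n_{exp}^{combined}((c_i, c_j)),\; n_{obs}^{high}((c_i, c_j))\right)}$$

where the combined expected value $n_{exp}^{combined}((c_i, c_j))$ is defined by:

$$n_{exp}^{combined}((c_i, c_j)) = r_{sc}^{all}(c_i) \cdot r_{sc}^{all}(c_j) \cdot \sum_{c_k \in syn(c_i),\; c_l \in syn(c_j)} n_{obs}^{high}((c_k, c_l))$$

where $r_{sc}^{all}(c_k)$ denotes the single codon ratio of $c_k$ in the whole-genome data set, and $n_{obs}^{high}((c_i, c_j))$ is the number of occurrences of the pair $(c_i, c_j)$ in the highly expressed group, and wherein the highly expressed group consists of the genes whose mRNA can be detected at a level of at least 20 copies per cell.

12. A method according to any of the preceding claims, wherein the original coding nucleotide sequence encoding the predetermined amino acid sequence is selected from:
(a) a wild-type nucleotide sequence encoding the predetermined amino acid sequence;
(b) a back-translation of the predetermined amino acid sequence in which the codon for each amino acid position is chosen at random from the synonymous codons encoding that amino acid; and
(c) a back-translation of the predetermined amino acid sequence in which the codon for each amino acid position is chosen in accordance with the single codon bias of the predetermined host or of a species related to the host cell.

13. A method according to any of claims 1-12, wherein the predetermined host cell is a microbial cell, preferably a microorganism of a genus selected from: Bacillus, Actinomycetis, Escherichia, Streptomyces, Aspergillus, Penicillium, Kluyveromyces, [...].

14. A method according to any of claims 1-12, wherein the predetermined host cell is an animal or plant cell, preferably a cell of a cell line selected from CHO, BHK, NS0, COS, Vero, PER.C6, HEK-293, Drosophila S2, Spodoptera Sf9 and Spodoptera Sf21.

15. A computer comprising a processor and memory, the processor being arranged to read from and write to the memory, the memory comprising data and instructions arranged to provide the processor with the capacity to perform the method of any of claims 1-14.

16. A computer program product comprising data and instructions and arranged to be loaded into the memory of a computer that also comprises a processor, the processor being arranged to read from and write to the memory, the data and instructions being arranged to provide the processor with the capacity to perform the method of any of claims 1-14.

17. A data carrier provided with a computer program product as claimed in claim 16.

18. A nucleic acid molecule comprising a coding sequence encoding a predetermined amino acid sequence, wherein the coding sequence is not a naturally occurring coding sequence, and wherein the coding sequence has a $fit_{cp}(g)$ of at least below -0.1, preferably below -0.2, more preferably below -0.3, for a predetermined host cell.

19. A nucleic acid molecule comprising a coding sequence encoding a predetermined amino acid sequence, wherein the coding sequence is not a naturally occurring coding sequence, and wherein the coding sequence has a $fit_{cp}(g)$ of at least below -0.1, preferably below -0.2, for a predetermined host cell, and a $fit_{sc}(g)$ of at least below 0.1 for a predetermined host cell.

20. A nucleic acid molecule according to claim 18 or 19, wherein the coding sequence is operably linked to an expression control sequence capable of directing expression of the coding sequence in the predetermined host cell.

21. A host cell comprising a nucleic acid molecule as defined in claim 20.

22. A method for producing a polypeptide having the predetermined amino acid sequence, the method comprising culturing a host cell as defined in claim 21 under conditions conducive to expression of the polypeptide and, optionally, recovering the polypeptide.

23. A method for producing at least one of an intracellular and an extracellular metabolite, the method comprising culturing a host cell as defined in claim 21 under conditions conducive to production of the metabolite, wherein the polypeptide having the predetermined amino acid sequence is preferably involved in production of the metabolite.

Abstract

The present invention relates to methods for optimizing a protein coding sequence for expression in a given host cell. The methods apply a genetic algorithm to optimize the single codon fitness and/or codon pair fitness of sequences encoding a predetermined amino acid sequence. In the algorithm, the generation of new sequence variants and the subsequent selection of suitable variants are repeated until the variant coding sequences reach a minimum value for single codon fitness and/or codon pair fitness. The invention also relates to a computer comprising a processor and memory, the processor being arranged to read from and write to the memory, the memory comprising data and instructions arranged to provide the processor with the capacity to perform a genetic algorithm for optimizing single codon fitness and/or codon pair fitness. The invention further relates to nucleic acids comprising a coding sequence for a predetermined amino acid sequence, the coding sequence having been optimized with respect to single codon fitness and/or codon pair fitness for a given host cell by the methods of the invention, to host cells comprising such nucleic acids, and to methods for producing polypeptides and other fermentation products using these host cells.

Document number: C12N15/67 GK101490262 SQ200780024670
Publication date: 22 July 2009
Filing date: 15 June 2007
Priority date: 29 June 2006
Inventors: Johannes Andries Roubos, Noël Nicolaas Maria Elisabeth van Peij
Applicant: DSM IP Assets B.V.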
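The fitness functions named in claims 5 to 7 can be illustrated with a minimal Python sketch. It assumes that single codon fitness is the mean absolute deviation between target and actual codon ratios, that codon pair fitness is the mean weight over adjacent codon pairs, and that the combined fitness divides the latter by (cpi + single codon fitness). It further assumes that a codon's "ratio" is its share among the synonymous codons for the same amino acid. The three-amino-acid codon table, the target ratios and the pair weights below are made-up illustration values, not data from the tables of this document.

```python
from collections import Counter

# Hypothetical toy data: three amino acids, invented target ratios and weights.
SYNONYMS = {"K": ["AAA", "AAG"], "F": ["TTC", "TTT"], "L": ["CTC", "CTG", "TTG"]}
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMS.items() for codon in codons}
TARGET = {"AAA": 0.25, "AAG": 0.75, "TTC": 1.0, "TTT": 0.0,
          "CTC": 0.5, "CTG": 0.5, "TTG": 0.0}
WEIGHTS = {("AAG", "TTC"): -0.4, ("TTC", "CTC"): -0.2,
           ("CTC", "AAA"): 0.3, ("AAA", "TTC"): 0.1}


def split_codons(seq):
    """Cut a coding sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def actual_ratios(codons):
    """Share of each codon among the synonymous codons actually used in g."""
    codon_counts = Counter(codons)
    aa_counts = Counter(CODON_TO_AA[c] for c in codons)
    return {c: n / aa_counts[CODON_TO_AA[c]] for c, n in codon_counts.items()}


def fit_sc(codons, target):
    """Single codon fitness: mean |target ratio - actual ratio| over all positions."""
    ratios = actual_ratios(codons)
    return sum(abs(target[c] - ratios[c]) for c in codons) / len(codons)


def fit_cp(codons, weights):
    """Codon pair fitness: mean weight of adjacent pairs (needs >= 2 codons)."""
    pairs = list(zip(codons, codons[1:]))
    return sum(weights[p] for p in pairs) / len(pairs)


def fit_combi(codons, target, weights, cpi=0.2):
    """Combined fitness; cpi > 0 balances pair fitness against codon fitness."""
    return fit_cp(codons, weights) / (cpi + fit_sc(codons, target))


codons = split_codons("AAGTTCCTCAAATTC")
print(fit_sc(codons, TARGET), fit_cp(codons, WEIGHTS),
      fit_combi(codons, TARGET, WEIGHTS))
```

Under these assumptions lower values are better for all three functions, which is consistent with the negative thresholds for codon pair fitness in claims 18 and 19.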
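The codon pair weights of claim 11 can be sketched in the same way. The sketch assumes the weight is (expected - observed) / max(expected, observed), where "observed" counts a pair in the highly expressed gene group and "expected" multiplies the two genome-wide single codon ratios by the total count of all synonymous pairs encoding the same two amino acids in that group. The function name `codon_pair_weights`, the two-amino-acid codon table and the input counts are hypothetical; selecting the highly expressed group (for example by the threshold of 20 mRNA copies per cell named in claim 11) is left outside the sketch.

```python
from collections import Counter
from itertools import product


def codon_pair_weights(high_group, genome_ratio, synonyms):
    """Weights for every synonymous variant of each amino acid pair seen in high_group.

    high_group   -- list of genes, each a list of codons (highly expressed set)
    genome_ratio -- genome-wide single codon ratio per codon
    synonyms     -- amino acid -> list of its synonymous codons
    """
    codon_to_aa = {c: aa for aa, cs in synonyms.items() for c in cs}

    # Observed occurrences of each adjacent codon pair in the high group.
    n_obs = Counter()
    for gene in high_group:
        n_obs.update(zip(gene, gene[1:]))

    # Total occurrences per amino acid pair (sum over all synonymous codon pairs).
    aa_pair_total = Counter()
    for (ci, cj), n in n_obs.items():
        aa_pair_total[(codon_to_aa[ci], codon_to_aa[cj])] += n

    weights = {}
    for (aa_i, aa_j), total in aa_pair_total.items():
        for ci, cj in product(synonyms[aa_i], synonyms[aa_j]):
            n_exp = genome_ratio[ci] * genome_ratio[cj] * total
            obs = n_obs[(ci, cj)]
            denom = max(n_exp, obs)
            weights[(ci, cj)] = (n_exp - obs) / denom if denom > 0 else 0.0
    return weights


# Hypothetical toy input: two amino acids with uniform genome-wide codon ratios.
toy_synonyms = {"K": ["AAA", "AAG"], "F": ["TTC", "TTT"]}
toy_ratio = {"AAA": 0.5, "AAG": 0.5, "TTC": 0.5, "TTT": 0.5}
toy_high = [["AAG", "TTC"], ["AAG", "TTC"], ["AAG", "TTC"], ["AAA", "TTT"]]
w = codon_pair_weights(toy_high, toy_ratio, toy_synonyms)
for pair in sorted(w):
    print(pair, round(w[pair], 2))
```

With this definition a pair that is over-represented in the highly expressed group receives a negative weight, and a pair that never occurs there receives a weight of 1, so the weights stay within the range -1 to 1.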
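Steps (a) to (e) of claim 1 describe an iterative generate-and-select loop, which can be sketched as below. This is one possible reading, not the applicant's implementation: the selection criterion is simple elitist truncation (keep the best few), the stop criterion is a fixed iteration count with an optional fitness threshold, and the fitness is the negated mean codon pair weight so that higher values are better, as step (d) requires. All names, the codon table and the weights are hypothetical.

```python
import random

# Hypothetical toy codon table and invented pair weights (unlisted pairs count as 0.0).
SYNONYMS = {"K": ["AAA", "AAG"], "F": ["TTC", "TTT"], "L": ["CTC", "CTG", "TTG"]}
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMS.items() for codon in codons}
WEIGHTS = {("AAG", "TTC"): -0.6, ("TTC", "CTC"): -0.4, ("CTC", "AAG"): -0.3,
           ("AAA", "TTT"): 0.5, ("TTT", "CTG"): 0.4, ("CTG", "AAA"): 0.2}


def fitness(codons):
    """Negated mean pair weight, so that a higher value means a fitter sequence."""
    pairs = list(zip(codons, codons[1:]))
    return -sum(WEIGHTS.get(p, 0.0) for p in pairs) / len(pairs)


def mutate(codons, rng):
    """Step (b): swap one random position for a synonymous codon (may be unchanged)."""
    child = list(codons)
    k = rng.randrange(len(child))
    child[k] = rng.choice(SYNONYMS[CODON_TO_AA[child[k]]])
    return child


def optimize(original, fitness, max_iter=200, n_children=20, n_keep=5,
             threshold=None, seed=0):
    rng = random.Random(seed)
    population = [list(original)]                        # step (a)
    for _ in range(max_iter):                            # step (e): iterate
        children = [mutate(rng.choice(population), rng)  # step (b)
                    for _ in range(n_children)]
        ranked = sorted(population + children,           # step (c): score all
                        key=fitness, reverse=True)
        population = ranked[:n_keep]                     # step (d): keep the fittest
        if threshold is not None and fitness(population[0]) >= threshold:
            break                                        # optional stop criterion
    return population[0]


original = ["AAA", "TTT", "CTG", "AAA", "TTT"]
best = optimize(original, fitness)
print("".join(original), fitness(original))
print("".join(best), fitness(best))
```

Because parents compete with their offspring in every round, the best fitness in the population never decreases, and because only synonymous replacements are made, the encoded amino acid sequence is unchanged.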