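Patent title: Method for producing a mildiomycin biosynthetic gene cluster
Technical field:
The present invention relates to a gene cluster in the field of biotechnology, specifically a mildiomycin biosynthetic gene cluster.
Background art: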
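Nucleoside antibiotics are a class of molecules, usually produced by microbial secondary metabolism, whose structures contain modified nucleosides and nucleotides. They show a broad range of biological activities, including antibacterial, antifungal, anti-nematode, antitumor, antiviral, herbicidal, immunostimulatory and immunosuppressive activities (J. Antibiot. (1988) 41, 1711-39). Mildiomycin is a water-soluble basic antibiotic produced by Streptoverticillium rimofaciens (J. Antibiot. (1978) 31, 511-8). Chemically it comprises a 5-hydroxymethylcytosine (found in nature only in the DNA of T-even bacteriophages), an unusual guanidino group bearing a γ-hydroxylation, and a 4-amino-pyran-3-ene moiety bearing a serine residue (J. Am. Chem. Soc. (1978) 100, 4895-7); it belongs to the nucleoside antibiotics (structure shown in Figure 1). Unlike other nucleoside and aminoglycoside antibiotics, mildiomycin does not cross the cell membrane easily; it acts by inhibiting peptidyl transfer during protein synthesis (J. Antibiot. (1985) 38, 415-9). Mildiomycin inhibits most bacteria and fungi and is especially active against plant powdery mildew. Powdery mildew is a fungal plant disease caused by fungi of the order Erysiphales (class Ascomycetes). These fungi can infect more than 650 species of monocotyledons and more than 9000 species of dicotyledons. Powdery mildew is currently controlled mainly with triazole chemical pesticides; however, the harm that chemical pesticides do to sustainable human development is attracting growing concern (Chinese Journal of Pesticide Science (2001) 3, 12-8). Its low toxicity and high efficacy make mildiomycin a good pesticide for controlling plant powdery mildew. The product currently on the market is a mildiomycin wettable powder produced by Takeda Pharmaceutical (Japan); because of its high price it is used mostly against powdery mildew pathogens on high-value ornamental plants.

Kishimoto et al. found that, in the presence of ferrous ions, adding a suitable amount of inorganic phosphate to the medium raises the mildiomycin yield (J. Antibiot. (1996) 49, 775-80). Sawada et al. mutagenized the producer with D-cycloserine and screened for high-yield strains on plates containing aminopterin, obtaining a mutant that produced 2.6 times as much as the original strain (J. Antibiot. (1997) 50, 206-11). Using protoplast fusion and related methods, Professor Xu Zhinan obtained the mutant Sv. rimofaciens ZJU5119, whose mildiomycin yield was 170% higher than that of the parent strain ZD615, reaching 1015 mg/L (Journal of Zhejiang University (Engineering Science) (2006) 40, 1262-6). Sawada et al. added 5-hydroxymethylcytosine, 5-methylcytosine, 5-bromocytosine, 5-iodocytosine and 5-fluorocytosine to the culture medium of Sv. rimofaciens and obtained the corresponding series of mildiomycin derivatives (J. Ferment. Technol. (1984) 62, 537-43). By changing the fermentation conditions, we previously also isolated deshydroxymethyl-mildiomycin, deshydroxy-mildiomycin and deshydroxy-deshydroxymethyl-mildiomycin from Sv. rimofaciens ZJU5119 (Journal of Shanghai Jiao Tong University (2009) 43, 1-4).

Recombinant DNA technology can be used to modify the chemical structure of antibiotics, raising potency, broadening the antibacterial spectrum and lowering toxicity; genetic engineering of regulatory genes can in addition raise antibiotic yield. As pioneers in this area, Hopwood et al. produced new hybrid antibiotics by transferring structural genes between different Streptomyces strains (Nature (1985) 314, 642-4). The genes involved in antibiotic biosynthesis, including structural, resistance and regulatory genes, are generally clustered (Annu. Rev. Microbiol. (1989) 43, 173-206). We therefore took the mildiomycin produced by Sv. rimofaciens ZJU5119 as the target molecule and, starting from the cloning of its biosynthetic gene cluster, elucidated the gene cluster responsible for mildiomycin synthesis. A search of the prior-art literature found no report of a mildiomycin biosynthetic gene cluster.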
Summary of the invention
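The object of the present invention is to overcome the deficiencies of the prior art by providing a mildiomycin biosynthetic gene cluster. The invention provides information on all genes and proteins involved in mildiomycin biosynthesis, laying a basis for the biosynthesis and genetic modification of mildiomycin; the gene cluster and its proteins can be widely used in agriculture, industry and medicine.

The invention is achieved through the following technical solution.

The invention relates to a mildiomycin biosynthetic gene cluster whose sequence is shown in SEQ ID NO: 1. The cluster contains 16 genes.

Structural genes (11): milA, milB, milC, milD, milE, milG, milH, milJ, milM, milN and milQ, wherein
gene milA is located at positions 6125 to 7126 of SEQ ID NO: 1;
gene milB is located at positions 7252 to 7761 of SEQ ID NO: 1;
gene milC is located at positions 7906 to 9165 of SEQ ID NO: 1;
gene milD is located at positions 9185 to 10369 of SEQ ID NO: 1;
gene milE is located at positions 10380 to 11198 of SEQ ID NO: 1;
gene milG is located at positions 11627 to 12631 of SEQ ID NO: 1;
gene milH is located at positions 12729 to 14948 of SEQ ID NO: 1;
gene milJ is located at positions 16202 to 17152 of SEQ ID NO: 1;
gene milM is located at positions 19548 to 20714 of SEQ ID NO: 1;
gene milN is located at positions 20710 to 21483 of SEQ ID NO: 1;
gene milQ is located at positions 25168 to 25935 of SEQ ID NO: 1.

Regulatory genes (2): milK and milO, wherein
gene milK is located at positions 17152 to 18477 of SEQ ID NO: 1;
gene milO is located at positions 23289 to 22222 of SEQ ID NO: 1.

Resistance gene (1): milP, located at positions 23298 to 24878 of SEQ ID NO: 1.

Other genes (2): milF and milI, wherein
gene milF is located at positions 11194 to 11664 of SEQ ID NO: 1;
gene milI is located at positions 14948 to 16027 of SEQ ID NO: 1.

The proteins encoded by the 11 structural genes are as follows:
the protein encoded by milA has the sequence shown in SEQ ID NO: 2 and is a CMP hydroxymethyltransferase;
the protein encoded by milB has the sequence shown in SEQ ID NO: 3 and is a CMP/hydroxymethyl hydrolase;
the protein encoded by milC has the sequence shown in SEQ ID NO: 4 and is a cytosine/hydroxymethylcytosine glucuronic acid synthase;
the protein encoded by milD has the sequence shown in SEQ ID NO: 5 and is a DegT/DnrJ/EryC1/StrS aminotransferase;
the protein encoded by milE has the sequence shown in SEQ ID NO: 6 and is an aminoglycoside phosphotransferase;
the protein encoded by milG has the sequence shown in SEQ ID NO: 8 and is a Radical SAM protein;
the protein encoded by milH has the sequence shown in SEQ ID NO: 9 and is a ligase;
the protein encoded by milJ has the sequence shown in SEQ ID NO: 11 and is an arginine hydroxylase;
the protein encoded by milM has the sequence shown in SEQ ID NO: 14 and is an Asp/Tyr/Aro aminotransferase;
the protein encoded by milN has the sequence shown in SEQ ID NO: 15 and is a dihydrodipicolinate synthase;
the protein encoded by milQ has the sequence shown in SEQ ID NO: 18 and is an aminoglycoside phosphotransferase.

The proteins encoded by the 2 regulatory genes are as follows:
the protein encoded by milK has the sequence shown in SEQ ID NO: 12 and is a major facilitator superfamily protein;
the protein encoded by milO has the sequence shown in SEQ ID NO: 16 and is a LuxR-family regulatory protein.

The protein encoded by the resistance gene is as follows:
the protein encoded by milP has the sequence shown in SEQ ID NO: 17 and is an ABC transporter.

Compared with the prior art, the invention has the following beneficial effects. The gene cluster of the invention can be used for the following purposes.

Cloned DNA containing the nucleotide sequence provided by the invention, or at least part of it, can be used to locate further library plasmids in the genomic library of Sv. rimofaciens ZJU5119. These library plasmids contain at least part of the sequence of the invention and also contain previously uncloned DNA from adjacent regions of the genome.

The nucleotide sequence provided by the invention, or at least part of it, can be modified or mutated. Approaches include insertion, replacement or deletion, polymerase chain reaction, error-prone polymerase chain reaction, site-specific mutagenesis, religation of different sequences, directed evolution (DNA shuffling) using different parts of the sequence or homologous sequences from other sources, and mutagenesis by ultraviolet light or chemical agents.

Cloned genes containing the nucleotide sequence provided by the invention, or at least part of it, can be expressed in a heterologous host through a suitable expression system to obtain the corresponding enzymes, other more highly bioactive substances, or higher yields. Such hosts include Streptomyces, Escherichia coli, Bacillus, yeast, plants and animals.

The amino acid sequences provided by the invention, or at least part of them, can be used to isolate the desired proteins and to prepare antibodies.

Polypeptides containing the amino acid sequences provided by the invention, or at least part of them, may retain biological activity or even acquire new biological activity after certain amino acids are removed or substituted, or may show increased yield, improved protein kinetics or other desired properties.

Genes or gene clusters containing the nucleotide sequence provided by the invention, or at least part of it, can be expressed in heterologous hosts, and their function in the host's metabolic network can be studied by DNA-chip technology.

Proteins encoded by the nucleotide sequence provided by the invention can catalyze the synthesis of hydroxymethylcytosylglucuronic acid and, further, the synthesis of the antibiotic mildiomycin.

Genes or gene clusters containing the nucleotide sequence provided by the invention, or at least part of it, can be used to construct plasmids by genetic recombination in order to obtain novel biosynthetic pathways; such pathways can also be obtained by insertion, replacement, deletion or inactivation.

Cloned genes or DNA fragments containing the nucleotide sequence provided by the invention, or at least part of it, can be used to interrupt one or more steps of mildiomycin biosynthesis and so obtain new structural analogues or precursors of mildiomycin.

The nucleotide sequence provided by the invention, or at least part of it, can be used to raise the yield of mildiomycin or its derivatives, for example by increasing the copy number or expression of positive regulatory genes and by knocking out negative regulatory genes. The invention thus provides a route to higher yields in genetically engineered microorganisms.

In summary, the invention provides information on all genes and proteins involved in mildiomycin biosynthesis, laying a basis for the biosynthesis and genetic modification of mildiomycin; the gene cluster and its proteins can be widely used in agriculture, industry and medicine.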
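The gene coordinates listed in the summary above lend themselves to a quick arithmetic check. The following is a minimal sketch, not part of the claimed method: the dictionary name `GENES` and the helper `gene_span` are hypothetical, coordinates are taken as 1-based and inclusive, and reading the descending coordinates of milO as minus-strand orientation is an assumption of this sketch.

```python
# Coordinates on SEQ ID NO: 1 exactly as listed in the summary (1-based, inclusive).
# Assumption: a pair given high-to-low (milO) denotes a gene on the minus strand.
GENES = {
    "milA": (6125, 7126), "milB": (7252, 7761), "milC": (7906, 9165),
    "milD": (9185, 10369), "milE": (10380, 11198), "milF": (11194, 11664),
    "milG": (11627, 12631), "milH": (12729, 14948), "milI": (14948, 16027),
    "milJ": (16202, 17152), "milK": (17152, 18477), "milM": (19548, 20714),
    "milN": (20710, 21483), "milO": (23289, 22222), "milP": (23298, 24878),
    "milQ": (25168, 25935),
}


def gene_span(name):
    """Return strand, nucleotide length and codon count for one listed gene."""
    start, end = GENES[name]
    strand = "+" if start <= end else "-"
    low, high = min(start, end), max(start, end)
    length = high - low + 1  # inclusive coordinates
    return {
        "gene": name,
        "strand": strand,
        "length_nt": length,
        "codons": length // 3,
        "in_frame": length % 3 == 0,  # a whole number of codons
    }


if __name__ == "__main__":
    # Print the genes in order of their position on SEQ ID NO: 1.
    for name in sorted(GENES, key=lambda g: min(GENES[g])):
        info = gene_span(name)
        print(f"{info['gene']}  {info['strand']}  {info['length_nt']} nt  "
              f"{info['codons']} codons  in-frame={info['in_frame']}")
```

With these coordinates every listed span is a multiple of three nucleotides long; milA, for instance, covers 1002 nt, or 334 codons. The source does not state whether the stop codon is included in each range, so the codon count is not necessarily the protein length.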
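Figure 1 shows the chemical structures of mildiomycin and deshydroxymethyl-mildiomycin;
Figure 2 shows the LC-MS detection of mildiomycin and deshydroxymethyl-mildiomycin produced by Sv. rimofaciens;
Figure 3 illustrates the heterologous expression of mildiomycin;
Figure 4 illustrates the determination of the boundaries of the mildiomycin biosynthetic gene cluster;
Figure 5 illustrates the analysis of the genes related to mildiomycin biosynthesis;
Figure 6 illustrates the MilC-catalyzed synthesis of cytosylglucuronic acid and hydroxymethylcytosylglucuronic acid;
Figure 7 illustrates that MilG is responsible for the synthesis of 4'-keto-hydroxymethylcytosylglucuronic acid;
Figure 8 shows the proposed mildiomycin biosynthetic pathway.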
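Detailed description of the embodiments
The invention is further illustrated below with reference to specific examples. These examples serve only to illustrate the invention and do not limit its scope. Experimental methods for which no specific conditions are given in the following examples generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the conditions recommended by the manufacturer.

The invention is further described below with reference to Figures 1 to 8.
In Figure 1, R = CH2OH for mildiomycin and R = H for deshydroxymethyl-mildiomycin.

(1) Fermentation of mildiomycin and LC-MS detection
The mildiomycin producer Sv. rimofaciens ZJU5119 was inoculated into TSBY medium (10.3% sucrose) in spring-fitted Erlenmeyer flasks and cultured at 30 °C for 6 days. The fermentation broth was adjusted to pH 5 and centrifuged, and the supernatant was analyzed by LC-MS on an Agilent 1100 series LC/MSD Trap system. An Agilent TC-C18 reversed-phase column (5 μm, 4.6 × 250 mm) was used; the mobile phase was 10 mM trichloroacetic acid (Sigma) / HPLC-grade acetonitrile (Merck) (92:8, v/v) at a flow rate of 0.3 ml/min. Mass detection was performed in ion-trap positive-ion mode. The drying gas flow was 8 l/min, the nebulizer pressure 40 psi and the drying gas temperature 325 °C. For multistage MS fragmentation the collision voltage was between 1.0 and 1.8 V. With a mildiomycin standard (purchased from Takeda Pharmaceutical) as control, the results (Figure 2) show that Sv. rimofaciens ZJU5119 produces not only mildiomycin but also its derivative deshydroxymethyl-mildiomycin.

(2) Heterologous expression of mildiomycin
Degenerate primers were designed from blsM, the cytidine monophosphate hydrolase gene of the blasticidin S biosynthetic gene cluster, and its homologues, and six mutually overlapping cosmids containing a blsM homologue were screened from the genomic library of Sv. rimofaciens ZJU5119. Knockout of milA and milB showed that both are essential for mildiomycin biosynthesis. One cosmid containing milA and milB, 14A6, was introduced by protoplast transformation into the model strain Streptomyces lividans 1326, and the transformants were fermented for 6 days in YEME medium (Difco yeast extract 3 g, Difco peptone 5 g, Oxoid malt extract 3 g, glucose 10 g, sucrose 340 g, distilled water 1000 ml; 2 ml of 2.5 M MgCl2 added after sterilization).

Bioassay: the indicator yeast Rhodotorula was inoculated into 10.3% YEME liquid medium and grown at 30 °C with shaking (220 rpm) for about 24 hours; the cells were collected by centrifugation and washed once with LB medium. PDA medium was melted and cooled to about 50 °C, 100 μl of the indicator culture was added per 20 ml of medium, and the mixture was immediately mixed and poured into Petri dishes to solidify. 20 μl of fermentation broth was applied, using sterilized Oxford cups, to the prepared PDA plates containing the indicator, and after 1 to 2 days at 30 °C the inhibition of indicator growth was observed.

HPLC-MS analysis: the collected broth was adjusted to pH 5.0 with oxalic acid and centrifuged at 12,000 g for 5 minutes, and the supernatant was treated on a cation-exchange cartridge (Supelclean LC-SCX, 500 mg/3 ml, Supelco). The cartridge was first activated with 3 ml of methanol; after loading it was washed with 2 ml of pure water and then 2 ml of 0.5% ammonia, and finally the fraction eluted with 3% ammonia was collected for analysis. Mildiomycin was detected by HPLC-MS on an Agilent 1100 series LC/MSD Trap system with an Agilent TC-C18 reversed-phase column (5 μm, 4.6 × 250 mm); the mobile phase was 10 mM trichloroacetic acid (Sigma) / HPLC-grade acetonitrile (Merck) (92:8, v/v) at a flow rate of 0.3 ml/min. Mass detection was performed in ion-trap positive-ion mode. The drying gas flow was 8 l/min, the nebulizer pressure 40 psi and the drying gas temperature 325 °C. For multistage MS fragmentation the collision voltage was between 1.0 and 1.8 V.

In Figure 3A the indicator is Rhodotorula; I, mildiomycin; II and III, fermentation extracts from two S. lividans transformants carrying 14A6; IV, fermentation extract from S. lividans carrying the empty vector. The broth of the S. lividans 1326 transformants carrying cosmid 14A6 produced an inhibition zone just like the mildiomycin standard, whereas the broth of the empty-vector transformant did not. Figure 3B shows the HPLC traces of mildiomycin and of the YEME fermentation extracts of the 14A6 and empty-vector transformants: the extract of the 14A6 transformant shows the mildiomycin peak, which is absent from the empty-vector transformant.

The bioassay and HPLC-MS data thus show that cosmid 14A6 confers on S. lividans 1326 the ability to produce mildiomycin, which indicates that 14A6 contains all the functional genes required for mildiomycin biosynthesis.

(3) Determination of the boundaries of the mildiomycin biosynthetic gene cluster
Mutant LL2 (milA knockout) lost the ability to produce mildiomycin and produced only deshydroxymethyl-mildiomycin, showing that milA is essential. The genes upstream of milA were therefore knocked out, and screening yielded the Sv. rimofaciens mutants LL4 (knockout of orf-1) and LL23 (knockout of orf-5 to orf-1). Bioassays showed that both still produce mildiomycin, so the upstream boundary of the cluster lies between orf-1 and milA. Bioassays of LL17 (knockout of orf+1 to orf+2), LL18 (knockout of orf+3 to orf+6) and LL9 (knockout of orf+7) showed that they still produce mildiomycin normally (see Figure 4A; LL4: orf-1 knockout; LL23: knockout of orf-1 to orf-5; LL17: knockout of orf+1 to orf+2; LL18: knockout of orf+3 to orf+6; LL9: orf+7 knockout; WT: wild type; CK: agar-block control), whereas the milQ mutant LL11 lost the ability to produce mildiomycin (Figure 4B; from bottom to top, HPLC traces of mildiomycin, of the fermentation broth of wild-type Sv. rimofaciens ZJU5119, and of the fermentation broth of the milQ mutant LL11). The downstream boundary of the cluster was therefore placed between milQ and orf+1. The mildiomycin biosynthetic gene cluster is thus delimited to the region from milA to milQ.

(4) Functional analysis of the mildiomycin biosynthetic gene cluster
The genes of the cluster in Sv. rimofaciens ZJU5119 were knocked out systematically and the fermentation products of each mutant were analyzed. The relevance of these genes to mildiomycin biosynthesis is summarized in Figure 5, in which structural, regulatory and resistance genes are shown in different colors. "+" indicates that knocking out the gene abolished mildiomycin production; "-" indicates that the knockout had no effect on production; "/" indicates that the knockout reduced the yield. The cluster includes 11 structural genes (milA, milB, milC, milD, milE, milG, milH, milJ, milM, milN and milQ), whose mutants lost the ability to produce mildiomycin; the regulatory genes milO and milK, where the mutant of the LuxR-family regulator MilO likewise lost production while the mutant of the major facilitator superfamily protein MilK showed a reduced yield; the resistance gene milP, whose mutant cannot produce mildiomycin; and the genes of unknown function milF, milI and milL. The milL knockout still produces mildiomycin normally, whereas the milF and milI knockouts lost production, which shows that milF and milI are related to mildiomycin biosynthesis and are part of the cluster.

(5) In vitro reactions further demonstrating the function of MilC
MilC is a protein of 463 amino acids. Sequence alignment shows some homology with BlsD (AAP03118) of blasticidin S biosynthesis (Identities = 144/338 (42%), Positives = 180/338 (53%), e-value = 2e-49). BlsD is regarded as a UDP-glucosyltransferase responsible for the synthesis of cytosylglucuronic acid (CGA) in blasticidin S biosynthesis and is called CGA synthase. Guo et al. found in 1991 that S. griseochromogenes contains an enzyme that catalyzes the formation of CGA from cytosine and UDP-glucuronic acid, and in 1994 they purified CGA synthase from the cells. They found that UDP-glucose, UDP-galactose and UDP-galacturonic acid are not suitable substrates of CGA synthase and that, apart from cytosine, adenine, uracil, 4-nitrophenol and α-naphthol are not suitable glycosyl acceptors. Cone et al., when analyzing the blasticidin S biosynthetic gene cluster, found that a 6.5 kb DNA fragment containing blsD, cloned into pIJ702 and expressed in S. lividans, produced CGA in the presence of cytosine, indicating that BlsD is the CGA synthase purified by Guo et al. However, the CGA synthase purified by Guo et al. had a size of 43 kD, which does not agree with the 34.5 kD calculated from the amino acid sequence of BlsD [J. Bacteriol. 1994 (176) 1282-6; ChemBioChem 2003 (4): 821-9]. A BLASTP search with MilC found only BlsD, and a conserved-domain search returned no result at all. We therefore used PSI-BLAST. Position-specific iterative BLAST (PSI-BLAST) is an iterative, position-specific BLAST search intended mainly for protein sequences. After the first BLAST search, the most similar sequences among the hits are used to build a PSSM (position-specific scoring matrix); this matrix is then used for a second round of searching, after which the matrix is adjusted and the search repeated, and so on. In the end, highly conserved regions receive relatively high scores while the scores of non-conserved regions fall toward 0, which raises the sensitivity of the search. Among the results we found some similar N-acetylglucosaminyltransferase sequences, which led us to believe that MilC may be involved in CGA synthesis (with UDP-glucuronic acid as substrate) and be responsible for transfer of the glycosyl group. Although UDP-glucuronosyltransferases are very common in mammalian xenobiotic metabolism and have also been found in fungi, they have rarely been reported in bacteria.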
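The PSSM idea described above can be illustrated with a small sketch. This is a simplified illustration only, not the PSI-BLAST implementation: the toy alignment `hits`, the uniform background frequency, the fixed pseudocount and the function names `build_pssm` and `score` are all assumptions made for the example, and real PSI-BLAST uses sequence weighting and data-derived background and pseudocount models.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Build a log-odds matrix (one dict per column) from gap-free aligned hits.

    Assumption: uniform background frequency of 1/20 for every residue.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        total = len(alignment) + pseudocount * len(AMINO_ACIDS)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            # Positive score: residue is enriched at this position.
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score(pssm, sequence):
    """Sum the position-specific scores of a sequence of matching length."""
    return sum(column[aa] for column, aa in zip(pssm, sequence))


if __name__ == "__main__":
    # Hypothetical four-residue hits from a first search round.
    hits = ["GDSL", "GDSV", "GDAL", "GESL"]
    pssm = build_pssm(hits)
    for candidate in ("GDSL", "GESV", "WWWW"):
        print(candidate, round(score(pssm, candidate), 2))
```

In this toy case the fully conserved first column scores its residue highest, while the more variable columns give scores closer to zero, which mirrors the behavior described for conserved and non-conserved regions. In an iterative search, sequences scoring well against the matrix would be added to the alignment and the matrix rebuilt.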
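Knocking out milC abolished mildiomycin production, showing that the gene is essential. To study the exact function of milC further, the protein was expressed and purified: the milC gene was cloned into the prokaryotic expression vector pET28a+ and overexpressed in Escherichia coli. Almost all of the recombinant protein, however, was present as inclusion bodies. After refolding and purification, soluble recombinant MilC was obtained, as shown in Figure 6A, where MW is the protein molecular weight marker, lane 1 is the soluble protein refolded after dialysis, and lane 2 is MilC purified from the soluble protein.

Cytosine and hydroxymethylcytosine were each used as substrate to analyze their reaction with UDP-glucuronic acid under MilC catalysis; the results are shown in Figure 6B (with inactivated MilC as control). In A, after 30 minutes of incubation, the HPLC trace of the control containing boiled, inactivated MilC still shows only the substrates cytosine and UDP-glucuronic acid (lower trace), whereas the upper trace shows, in addition to cytosine and UDP-glucuronic acid, two products, UDP and cytosylglucuronic acid (CGA), whose retention times, UV absorption and MS all agree with the standards. In B, after 30 minutes of incubation, the HPLC trace of the control containing boiled, inactivated MilC still shows only the substrates hydroxymethylcytosine and UDP-glucuronic acid (lower trace), whereas the upper trace shows, in addition to hydroxymethylcytosine and UDP-glucuronic acid, two products, UDP and HM-cytosylglucuronic acid (HM-CGA). Both cytosine and hydroxymethylcytosine can therefore react with UDP-glucuronic acid under the action of MilC, giving CGA and HM-CGA respectively. In the experiments on the glycosyl donor, MilC could not catalyze the reaction of UDP-glucose with cytosine or hydroxymethylcytosine.

The kinetic parameters of MilC are of considerable interest because they indicate whether cytosine or hydroxymethylcytosine is the preferred substrate of the enzyme. We therefore added excess UDP-glucuronic acid to the reaction system and determined the Michaelis constants of MilC for cytosine and hydroxymethylcytosine; the results are shown in Figure 6C and Table 1. MilC has similar Km values for cytosine and hydroxymethylcytosine, but the Kcat for cytosine is 1.9 times that for hydroxymethylcytosine, so cytosine is the better substrate for MilC.
Table 1. Comparison of the kinetic parameters of MilC for cytosine and hydroxymethylcytosine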
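                                   Cytosine           Hydroxymethylcytosine
Kcat (unit unreadable in source)   1.070 ± 0.0037     0.5623 ± 0.0257
Km (μM)                            200.0 ± 18.26      206.8 ± 24.49

The reaction catalyzed by MilC in the mildiomycin biosynthetic pathway is shown in Figure 6D.

(6) Detection of an intermediate further demonstrating the function of the Radical SAM family protein MilG
milG encodes a protein of 335 amino acid residues. A search of the Pfam database shows that MilG belongs to the Radical SAM superfamily (PF04055). These proteins use a special iron-sulfur center to cleave S-adenosylmethionine (SAM) reductively and generate a radical. The discovery of this protein family showed the importance of radical-dependent catalysis for a number of complex chemical reaction pathways that had previously been unresolved, and also reflects the ancient and conserved nature of the family. Radical SAM enzymes catalyze many kinds of reaction, including unusual methylations, isomerizations, sulfur insertion, ring formation, anaerobic oxidation and protein radical formation. They play important roles in the biosynthesis and biodegradation of DNA precursors, vitamins, coenzymes, antibiotics and herbicides; examples include lysine 2,3-aminomutase, spore photoproduct lyase, pyruvate formate-lyase, anaerobic ribonucleotide reductase and biotin synthase (Chem. Rev. 2003 (103): 2129-48).

Figure 7A (MIL: mildiomycin; WT: wild-type Sv. rimofaciens ZJU5119; LL8: Sv. rimofaciens milG mutant LL8; the chemical structure of hydroxymethylcytosylglucuronic acid is shown on the figure) shows that Sv. rimofaciens with milG knocked out lost the ability to produce mildiomycin. At the same time, the intermediate hydroxymethylcytosylglucuronic acid ([M+H]+, m/z 318) accumulated in large amounts in the fermentation broth of the mutant, which indicates that MilG probably uses hydroxymethylcytosylglucuronic acid as its substrate. In addition, within the whole mildiomycin biosynthetic gene cluster milG is the only gene that may encode a protein taking part in an oxidation reaction. It is therefore inferred that MilG catalyzes the oxidation of the hydroxyl group at C-4 of the sugar of hydroxymethylcytosylglucuronic acid to a carbonyl group. milG is essential for mildiomycin biosynthesis, and its role in the pathway is shown in Figure 7B.

(7) Roles of the individual genes in the mildiomycin biosynthetic gene cluster
The functional studies of MilA, MilB and MilC described above establish the steps from CMP to (hydroxymethyl)CGA. Determination of the cluster boundaries, knockout of the individual genes and bioinformatic analysis indicate the roles of the remaining genes in the cluster.

Function of milG: synthesis of mildiomycin from hydroxymethyl-CGA requires an amino group to be introduced at C-4, and before transamination the hydroxyl group must be oxidized to a carbonyl group. MilG is a Radical SAM protein, able to use radicals to catalyze many reactions that are chemically extremely difficult. The large accumulation of hydroxymethyl-CGA in the fermentation products of the milG disruption mutant Sv. rimofaciens LL8 supports this inference. The reason why no accumulated intermediates were found in the other mutants may be the instability of the intermediates formed after the action of MilG. Comparison with the blasticidin S biosynthetic gene cluster likewise reveals a Radical SAM protein, BlsE; given the structural similarity of mildiomycin and blasticidin S, MilG and BlsE are probably both responsible for oxidizing the hydroxyl group at C-4 of the sugar of (hydroxymethyl)CGA to a carbonyl group.

Function of milM and milN: the condensation of the arginine side chain with the glucuronic acid resembles the dihydrodipicolinate synthesis catalyzed by MilN, that is, the condensation of a semialdehyde with a keto acid. The aminotransferase MilM can deaminate arginine to the α-keto acid, and MilN catalyzes the reaction of the α-keto acid with the decarboxylated hexose. The gene disruption mutants of both milM and milN lost the ability to produce mildiomycin.

Function of milE and milQ: formation of the double bond between C-2 and C-3 of the sugar moiety is another very unusual reaction in the mildiomycin biosynthetic pathway. Although there is currently no direct evidence of which gene is involved, the boundary determination and gene disruption experiments lead us to believe that the two phosphotransferases MilE and MilQ are probably responsible for forming this double bond. The gene disruption mutants of both milE and milQ lost the ability to produce mildiomycin.

Function of milD, milI and milH: with regard to formation of the serine side chain, MilD is a DegT/DnrJ/EryC1/StrS-type aminotransferase that can introduce an amino group at C-4 once it has been oxidized to a carbonyl by MilG. MilI contains a phosphopantetheine-binding site; phosphopantetheine is the prosthetic group of the acyl carrier proteins of some multienzyme complexes, where it acts as a swinging arm that binds activated fatty acid and amino acid groups, so MilI may take part in the activation of serine. MilH, a ligase with an ATP-binding site, may catalyze the condensation between the serine residue and the amino group at C-4 of the sugar, forming a peptide-like amide bond. By comparison with blasticidin S biosynthesis, the MilH homologue BlsK may likewise be responsible for joining the amino acid residue to the sugar; in both the blasticidin S and the puromycin biosynthetic pathways this reaction is catalyzed by a ligase and not by an NRPS (ChemBioChem 2003 (4) 821-9).

Of the regulatory genes responsible for mildiomycin biosynthesis, milO and milK, the mutant of the LuxR-family regulator MilO likewise lost the ability to produce mildiomycin, while the mutant of the major facilitator superfamily protein MilK showed a reduced mildiomycin yield. This shows the important role of the two regulatory genes in the mildiomycin biosynthetic pathway. The mutant of the resistance gene milP cannot produce mildiomycin, so milP is indispensable for synthesis. milJ, as the only oxidoreductase gene in the cluster, is responsible for the hydroxylation of arginine. The milF knockout mutant lost the ability to produce mildiomycin, which shows that milF is related to mildiomycin biosynthesis and is part of the cluster.

The functions of the whole gene cluster, as derived from in vitro enzyme assays, in vivo mutation experiments and bioinformatic analysis, are summarized in Figure 8.

Example
Step 1: extraction of total DNA from the mildiomycin producer Sv. rimofaciens ZJU5119
Streptoverticillium was inoculated into TSBY medium (10.3% sucrose) in spring-fitted Erlenmeyer flasks and cultured at 30 °C for 48 h. The cells were collected by centrifugation and resuspended in 5 ml of SET buffer (75 mM NaCl, 25 mM EDTA pH 8.0, 20 mM Tris-HCl pH 7.5). 100 μl of lysozyme solution (50 mg/ml) was added and the suspension was kept at 37 °C for about 60 minutes. After lysis, 140 μl of proteinase K solution (20 mg/ml) was added and mixed in, followed by 600 μl of 10% SDS; the tube was mixed by inversion and incubated at 55 °C for 2 h with occasional inversion. 2 ml of 5 M NaCl was then added and mixed thoroughly; after cooling to 37 °C, 5 ml of chloroform was added and mixed gently at room temperature. The mixture was centrifuged at 20 °C and 4500 g for 15 minutes. The supernatant was transferred to a new tube, 0.6 volume of isopropanol was added and mixed by inversion, and after about 3 minutes the DNA was spooled out with a glass rod into a new tube containing 70% (v/v) ethanol for washing. The wash was repeated twice, and the DNA was air-dried and dissolved in TE.

Step 2: construction of the Sv. rimofaciens genomic library
(1) Partial digestion of total DNA and recovery of large DNA fragments
The extracted total DNA was partially digested with Sau3AI and separated on a 1% low-melting-point agarose gel by pulsed-field gel electrophoresis (PFGE, Bio-Rad) in 0.5× TBE running buffer. DNA fragments of about 40 kb were recovered. Their ends were dephosphorylated with heat-labile alkaline phosphatase (APex Heat-Labile Alkaline Phosphatase, EPICENTRE Biotechnologies) in preparation for ligation to the vector, packaging and transfection.
(2) Construction and preparation of the cosmid vector
To facilitate heterologous expression in Streptomyces, a new shuttle cosmid vector able to integrate into the Streptomyces chromosome was constructed from plasmids pOJ446 and pSET152: pOJ446 was digested with XbaI and XhoI, and the part containing the multiple cos sites was ligated to the fragment of pSET152, cut with the same enzymes, that contains the integrase gene and the attP site. The extracted pJTU2554 plasmid was linearized at its unique HpaI site, its ends were dephosphorylated with CIAP (NEB), and it was then cut with BamHI into two fragments.
(3) Ligation and packaging
The prepared genomic DNA (about 40 kb) and the cosmid vector were ligated at a 1:1 molar ratio with T4 ligase (NEB). Phage packaging extract (MaxPlax Lambda Packaging Extracts, EPICENTRE Biotechnologies), thawed on ice, was added to the ligation product, mixed without introducing bubbles, centrifuged briefly and incubated at 30 °C for 90 minutes; a second aliquot of packaging extract was then added and incubation continued for another 90 minutes. Phage dilution buffer (100 mM NaCl, 10 mM MgCl2, 10 mM Tris-HCl pH 8.3) was added to 1 ml, 25 μl of chloroform was added, and the mixture was stored at 4 °C.
(4) Transfection and storage
Escherichia coli EPI300 was grown to OD600 = 0.8 to 1.0 for use as host cells, which can be stored at 4 °C for 72 h. The packaged product was mixed with the host cells, incubated at 37 °C for 20 minutes and plated on LB plates containing apramycin. The plates were incubated at 37 °C overnight. Single colonies were picked into 96-well plates containing LB medium with antibiotic and grown for a further 18 hours; sterile glycerol was added to a final concentration of 20%, and the plates were stored at -70 °C.

Step 3: screening of the genomic library
The desired cosmids were screened from the genomic library by PCR. Equal volumes of culture from all 96 wells of each plate were pooled, inoculated and grown, and the plasmid extracted from the pool served as one template for PCR screening. For each positive plate, equal volumes of culture from the 12 wells of each row were pooled, inoculated and grown, and the plasmid extracted served as one template for PCR screening. Individual PCR screening was then carried out within the positive rows until all positive clones had been identified.

Step 4: construction of double-crossover gene replacement plasmids (with pJTU412 as vector; the target gene is replaced by a spectinomycin resistance gene)
The gene replacement vectors were constructed by the PCR-targeting method. The cosmid used to build the library (pJTU2554) is an integrative vector in Streptomyces and would integrate the entire plasmid at the chromosomal attB site, so it is not suitable for gene replacement. Intermediate vectors therefore had to be constructed first. This study used pJTU412 as the vector; it is an E. coli-Streptomyces shuttle plasmid that is genetically unstable in Streptomyces and is lost very easily in the absence of antibiotic selection. The cosmids responsible for mildiomycin biosynthesis that were screened from the genomic library were digested and ligated into pJTU412 cut with the corresponding restriction enzymes, giving the intermediate vectors. Each intermediate vector was transformed into E. coli BW25113 containing plasmid pIJ790 (E. coli BW25113/pIJ790), competent cells were prepared, and these were electroporated with PCR-amplified DNA of the aadA gene from pIJ779 or pIJ778 (carrying tails homologous to both flanks of the gene to be replaced). This gave the gene replacement plasmids, which were transformed into E. coli ET12567 to eliminate the effect of DNA methylation.

Step 5: expression of fusion proteins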
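(1) Construction of expression plasmids
All expression work in this example used pET28a+ (Novagen) as the expression vector. The gene to be expressed was amplified with high-fidelity KOD polymerase and primers carrying restriction sites, digested, and ligated into the correspondingly digested expression vector to give a fusion expression plasmid, which was transformed into E. coli DH10B. After the sequence had been verified, the plasmid was transformed into E. coli BL21(DE3) (Novagen) containing plasmid pLysE.
(2) Expression of fusion proteins
E. coli BL21(DE3) (Novagen) containing the fusion expression plasmid was inoculated into LB medium containing chloramphenicol and kanamycin and grown at 37 °C overnight. 10 ml of this seed culture was then inoculated into 1 L of LB medium containing the same antibiotics and grown to OD600 = 0.6; the temperature was lowered to 28 °C, IPTG was added to a final concentration of 1 mM, and incubation was continued for 5 hours.
(3) Purification of fusion proteins
The cells were collected by centrifugation at 12,000 g for 5 minutes and resuspended in 25 ml of lysis buffer (20 mM sodium phosphate, 0.5 M NaCl, pH 7.4), freeze-thawed twice, placed in an ice bath and sonicated to break the cells and shear the DNA (10 pulses of 60 s at 60 s intervals), then centrifuged at 4 °C and 16,000 g for 45 minutes.
For soluble fusion proteins, the supernatant was loaded on a nickel chelating column (HisTrap HP column, GE Healthcare) and purified on an AKTA FPLC (GE Healthcare) with a linear gradient of elution buffer (20 mM sodium phosphate, 0.5 M NaCl, 0.5 M imidazole, pH 7.4); the protein was collected and checked by SDS-PAGE. The purified protein was buffer-exchanged on a desalting column (HisTrap Desalting column, GE Healthcare) into 50 mM Tris-HCl (pH 7.4), glycerol was added to a concentration of 20%, and the protein was stored at -80 °C. For insoluble proteins, the pellet obtained after cell disruption was collected and the protein was refolded according to the instructions of the Protein Refolding Kit (Novagen), then purified as described above for soluble recombinant protein. Protein was quantified by the Bradford method (Bradford Protein Assay Kit, Bio-Rad).
Sequence listing
<110> Shanghai Jiao Tong University
<120> Mildiomycin biosynthetic gene cluster
<160> 18
<170> PatentIn version 3.5
<210> 1
<211> 43561
<212> DNA
<213> Streptoverticillium rimofaciens ZJU5119
<400> 1
ccaagcttgg gctgcaggtc gactctagag atatcggatc accgtcagct tctcgcaggt  60
ctcgccctcg gcggcgatga cctgcccggc ctcgaaggcc accggctcga aggcgcccgc  120
gagttcggcc agtaaggcct cgtcggcctc gcgcaggaag ggcagttcgc gcaggtcctc  180
ggggacgacg cggtgcgcgc cgccctcgct gtagcagctg atgcggtcgt cgccgaggat  240
gaacgtccgg cggcggttga cgcggtagac accggactcg acgtcgaccc agggcagcgc  300
gcgcagcaga tagcgcggcg tgatcccgcg catctgcggg gtggtcttgg tggtcgtggc  360
caactgccgt gcggcatcgg gcgcgagact gagacgatcg ttcatgcggt cctcctcgaa  420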
gagcccgcgcggcagcccgc gcgggagcgc agaccgatcg tccggtcggt gggttcgtgt480cacaagacag gggattccga cacggtccgg aaatgatcga ctggcggagt gtgatgccgc540ggccctcgatccggggcaag ggcttcaccg ccgcgcgccc gcttggtccg gaccgcgaga600tgaccggattccacccactg agacccgcct cccgggcgcg tcgcaccgac ggggcagcgc660aaaaccgccctcgccgccgg ttcggtcccc cgaaaaccgg tcccgtcccc gaagcagaac720cgcccctaccgcccccggaa aacagaaccg ccccgatccg gtgatccgga tcggggcggt780tccacaagcg gtagcggtgg gatttgaacc cacggtggag ttgcccccac acacgctttc840gaggcgtgctccttaggccg ctcggacacg ctaccgagag agagcttagc ggacggtggg900ccgtgctctg aaatccgttc cccggtgaca gggtgtggtg ctggtcaccg gtcgcggaag960aagcgggtga gtagctcggc gcagtcgtcg gcgaggacgc cggtgacgac ctcgggacgg1020tggttgagacggcggtcgcg gacgacgtcc cagagggagc ccgccgcgcc ggccttctcg1080tcgacggcgccgtagacgac gcggtccagg cgggagagga cgatcgcgcc cgcgcacatc1140gtgcagggctccagggtgac gacgagcgtg cagcccgaca ggcgccactc gccgacggcc1200cgcgcggcctcgcggatggc gaggacctcc gcgtgaccgg tggggtcgcc ggtggcctcg1260cgctcgttgcggccgcggcc gatgacggag ccgtccctgt ccagcacgac ggcacccacc1320ggcacgtcgccggtggcggg ggccgccgcg gcctccgcca gggcggcccg catgggagcg1380acccaggggtcgcgcaccgg gtcggggacg gggacggctg cgtcgtggac ggcttcgttc1440atggcaccagtgtgagcgca cccgccctgg cggacgtcag cggacggcct ccaggacgtc1500ggtgcagccgagggcctcgg cgatcgagcc gagggcgtcc ccgtccagcg acaggatctc1560cttctcgctcaccccgaggt cctccaggag cctgccgtcg cccagcgggc cgctgggcgc1620gacggaggcgcccgccccgt cgtcgctctc gtcctcctcc gtctcgggtt cgccgtcctc1680ggtgccgtcgaggtcgaggg tgtccagcgc gtcgtcctcg tcgtcgtcgc ggccgacgag1740ctcgtcgacgagcatcgccc cgtacgagct gcggttggcg acgacggcgt tcgagacgta1800gacccgggggtcgtcctcgc cgtccacccg gacgacgccg aaccaggcgt cctcctgctc1860gatgagcaccaggaccgtgt cgtcgtcgta cgaggcctcg cgggcgaggt cggcgatgtc1920ggacagggtctccacgttgt cgagttctgt gtcgctcgct tcccacccgt cttcggtgcg1980cgcgagcattgcggcgaagt acaccgtgac tctcccactg gtcataggcg gtgccgggtc2040ggacggggaccaccccgccc actcggaatc gtggcagaaa cctgggcgtt gcgagaggtc2100ttccgcgctgcgtcgtgcag cagtccgaga gatgtcgctc acgtggggcc cgtgagggcg2160ccgtacgggcgtgacgcggc gcgacacggt 
gcggcatggt gcgacgcggc gccctgggcg2220tggcgcggacgggtccgccg ggacggtcac cagcggaagg tgcgcatccg catctgctga2280cgcatccgggccgcccgggc ccggcgcggc tggacgcggt cgcgcagctc cttggcctcg2340ttcagctcacggaggaactg ggcccggcgc cgtctgcggt cctcggcgct ctcggggctc2400tcgtccgggtcggcggcctc ccggggccgc cgggggtccg gccgctcgcc gccgggcgtc2460cggcgccggtcctcgcggcg tttcccggtg tctatggagt ccggttcccg ggtgtccgct2520tccgctgtctcccggggcgt ccgggcaccc gacccggatc cccgcctgtc atggccggcc2580atgggcagcaccacctcgtg ccgaggtcct cgcccggcga caggccgggc gtacgtgccc2640actttccccctaagtggtgg tttgatgcca gggctgcgac agaccgtcga agcctcgccg2700aagccccgccgtggcgcctc ggcgggagcc cgggcattaa agctcggtta atgtcgatgt2760
catgcggatccacgtcgtcg accaccccct ggtggctcac aagctcacca cgctgcgcga2820caagcgcaccgattccccca ccttccggcg cctctgcgac gagctggtca ccctgctcgc2880gtacgaggccacgcgcgacg tgcgcaccga gcaggtcagc atcgagtccc cggtgaccgc2940caccaccggcgtccggctct cccacccgcg tccgctggtg gtgccgatcc tccgggccgg3000cctgggcatgctggacggca tggtccggct gctgccgacg gccgaggtcg gcttcctcgg3060catgatccgcaatgaggaga ccctcaaggc ggagacgtac gcgacgcgca tgccggagga3120cctctccggtcgccaggtct acgtcctgga cccgatgctc gccaccggtg gcacgctcgt3180cgccgcgatcaacgagctga tcgcccgtgg cgcggacgac gtcaccgcga tctgcctgct3240ggccgcgccggagggcgtcg aggtcatgga gcgcgagctc gagggcgcgc cggtgaccgt3300cgtcaccgcgtccgtggacg agcggctcaa cgagcacggc tacatcgtcc cgggcctcgg3360cgacgccggcgaccggatgt acggcaccgc gggctgaccc ccgagccggt ttccctctcc3420tccgcccgccgacctcggcg accgcccggc gccgtcctcg gcgaccgccc ggcggccgag3480ctcagcgtctgcccggccgc cggcccaggc gaccgcccgg cgccgtcctc agcgaccgcc3540cgccgccctcagcacttggc ggcgggcgag cccggtgcgg ccgacgccga cggcggggga3600gageccgtggcgccgggcga ggccgacacc gagggcgcgg gccggcccgt gaggccggtc3660agcgccttgtcggcctcctc ctggggcatg agcccctgaa aggccgcccc cagcacgagg3720tcgacgtcctgtccctcgcg ctggtcgctc ttgggctccg ccccggcgag ctgggtgccc3780agcacccggagcgcgccctc cagggactcc tgggacccca gcagtatgcc ggtgccctcc3840accttcttgtcgtagtcggc ctgggcgttg cccaccttgc cgatcctgaa gccgcgcttc3900tccagctcgtccgccgtgat cttggcgagc ccgccgcgcg gcgtggcgtt gtagacgttg3960acggtgatgtcgccgggccg gggcaggtcc ctgggcagct tccgggcggg gacccgggcc4020gcgtcggccccgctcttgca gtcgccgctc cgctttcccg aagccgcggc ccgcgtcgga4080gcgggcccgcccgagaagac gtcgacgagc tggaccgttc cccacccggc cagccccagc4140acgacggccgtggcggtgcc ggcgagcacg atcctgcggc ggttccgggt gcggcgcata4200tgcgggaaccgatggcccgt gatgcggtac ttaccaccca tgccaggagg ggtgagcatg4260ctcatgagcgcagcgtagtg ccgggcggag ccgctgccta ctagatgatc aatgggttgc4320ccggacccctacccaaaagg gccaataacc gcccatgcga ccgtttttcc ggagggcggt4380acgagcccggaacgacggcc cggaaagcgc ccggtccgca tatcggtccg ggaatcgaca4440aaagtgccgaacgaagcgcc gagacggcgg ccggacgggt gacggccccg gcggtcagtc4500cagttcgagcacgcgcgcgt 
gcagcacctg gcgctgctgc agcgcggccc gtacggcccg4560gtgcagcccgtcctccaggt agagatcgcc ccgccacttc acgacgtggg cgaagaggtc4620cccgtagaacgtcgagtcct ccgcgagcag cgtctcgagg tcgagctgct gcttggtggt4680caccagctggtccaggcgta ccgggcgagg ggcaacatcc gcccactggc gggtgctttc4740ccggccgtggtcggggtacg gccgcccgtt tccgatgcgc ttgaagatca cacggaaagc4800ctaccgggcgagcggctccc ggcgcagcca tggcgcggga gtgcgatggt gacaatcagc4860cgcataccgggagtgatgca tggaactctt gagcgagaat tctttcgtca gagctccgga4920ttccggatggggtccaccgg aatccgctcg cccgggccgt tccccgccgc cgcgaccgcc4980cgctcactcggcgatctcgg tccgctccca ccactcgtag acgggcagcc tgccctcggc5040ggtgtcctgatgccgcgagg tcttcttgaa gtgctcgtag ccgcccttga acgggatctt 5100
cagctcgaccccgggagggg tgatcgtgac gacccgctcc ggaagatcgt ccggaccgcc5160ttcgaggaatgctttgggag cgctgctcat gggggacagt cttccggcgc tccccgccgt5220acgtgcggcgcgacgcgccg caggccgagg gggcgggcgt cacgccttct tgaccgccgt5280cttcttggcggctcgtttca tttcctgctt gtaggcccgg accttgtcca gggactccgg5340
cccggtgatgtcggccaccg agcggtacga ccccgcctcc ccgtaggagc cggccgcctc5400ccgccagccctccggcgtca cccccagtcg cttgcccagc agcgccagga agatctgcgc5460cttctgcttgccgaaccccg gcagcgcctg gagccgctcc agcagttcgc gccccgtcgc5520cgcgccggaccacaccgcgc tcgcgtcccc gtcgtacgtc tcgaccagat ggcggcacag5580ctgctgcacccggcccgcca tggaccccgg atagcggtgc acggccggct tctccgcgca5640cagcgcggcgaacgcctcgg ggtcgtacgc cgcgatctcg tgcgcgtcca gatcgtctcg5700cccgagccgccgggcgatgg tgtacgggcc ggtgaaggcc cactccatcg ggatctgctg5760gtccagcaacatgccgacca gggcagccag cgggctgcgc gagagcaggg cgtcggcgtc5820gggctgctgggcgagccgga ggggacggtc catgggccga tggtccctcc gggagggcgg5880cggcgcatgccgtcccggcc gttcggcggt gtcgggtggt ggtgcccggt cgtgtcgccc5940ggtcgtgtcgctcggtggtg tccgccggtg ccccggggtg tccaccggtg ttctgtggtg6000tcagcaattgcggtctgcag ctagtggtca acgcggcggc attggtccgc ggcccggccc6060tgggcaggatggcggacgcc gatcggcaag tcctgtggta cctcactatt acgggcgggc6120agtgatggaaacccatacgt tcgggacgtt ccaagacgct tatctgagcc agctgcgcga6180catctaccactcaccggaat tccgtaacgc accgcgtgga caggcgagtc gcgaacggat6240cggcgccggattccggctgc tggatcccgt gcagcgccac atatccgtgc cggcccggcg6300cgccaacgtcgtgttcaact tcgccgaggc gctctggtac ctctccggct ccgaccgcct6360cgacttcatccagtactacg cgcccggcat cgcggcctat tcggccgacg ggcggaccct6420gcggggcaccgcctacgggc cccgtatctt ccgccacccg gcgggcgggg tgaaccagtg6480ggagaacgtcgtcaagacgc tgacggacga ccccgacagc aaacgggccg tcatccagat6540cttcgacccccgggaactgg ccgtcgccga caacatcgac gtcgcctgca ccctggccct6600gcaattcctgatccgcgacg ggctgctctg cggcatcggc tacatgcggg cgaacgacgc6660cttccggggcgccgtgagcg acgtcttctc cttcactttc ctgcaggaat tcacggcccg6720ctatctcgggctcggtatcg gcacgtacca ccacgtcgtg gggtccgtgc acatctacga6780cagcgacgcccggtgggcgg agcgggtgct ggacgccgcg acgccggacg gcggcccgcg6840gcccggcttccccgccatgc cggacggcga caactggccg cacgtccgcc gtgtactgga6900gtgggaggaacgcctccgca cgaacgcggc gcgcctctcg gcggacgccc tggacgccct6960ggacctgcccgcctactgga agcacgtcgt ggcgctgttc gaggcccacc gtcaggtccg7020gcacgaggacacgcccgacc gggcgctgct cgccgcgctg cccgaggtct accggcagtc7080gctggccgtcaaatggcccg 
gccacttcgg ctctccggcc ggctcctgac cccgatcggc7140tcctgacctcggtccgttcc cgaccccggt ccgttcccga ccgttcgccg ggcacgcccg7200gacgacgcgaccccaccaga gacgcgaccc caccagaaag gaacaacccc ggtgaccacc7260acccccaagccccgtaccgc ccccgcggtc ggctcggtct tcctcggcgg gccgttccgc7320cagctcgtcgacccccgcac cggtgtgatg agcagcggtg accagaacgt cttcagccgc7380ctcatcgagcacttcgagag ccgcggcacg acggtctaca acgcccaccg ccgcgaggcc7440
tggggcgccg aattcctgtc gcccgccgag gcgacccggc tcgaccacga cgagatcaag 7500gccgcggacg tcttcgtcgc cttccccggc gtcccggcct cccccggcac ccatgtggag 7560atcggctggg cgagcggcat gggcaagccc atggtcctgc tgctggagcg cgacgaggac7620tacgcgttcc tggtcaccgg tctggagagc caggccaatg tggagatcct ccggttctcc7680ggcaccgagg agategtcga gcggctggac ggggccgtcg cccgggtgct gggccgggcg7740ggcgagccga cggtcatcgg ctgaggcgcg gcccgcatgg acctcttatc ggcggcccgg 7800gcggaccggg cggatcggtc ggaccaggcg gatcggccga ctcggccgga cggagcggat7860cgggcggact ggacggctcg ggcggacgga cccgtgaccc tggcggtggc gggtgccgag 7920ttcggctggg ggagcgcggg gaagctggcc gcgatcgtcg ccgcgttgcg cgaacggcac7980ggcgagcggg tccggttcgc cggcctgggc tccgggctcg ggcgccccgt gctgggcgcc8040ctggacgccc gcgactggac ggacgtgccg gagccgggcg acggcccggc gggcgaggcc8100gcgctggcgg cgctgctgcg cgagcggggc gtggacgcgg cggtcgtcgt cctcgacggc8160ctgctggcgg cccggctgga ggcggtgggg tgtcccgtcg tctacgtcga cagcctgccc8220ttcctgtgga ccgagcacga cttcgtcccg tccggagtcc acacctactg cgcgcagttg 8280tgcccctcgc tgccccggca gagctggccc gtgctgcgcg ggatcgaggc actgcgctgg8340gtggaaccgg tggtgggcac gtacggggcc ggcggcctcg acccggtgcc ggggaaggcc8400gtgctcaacg tcggcggcct gcgctcgccg ttcaccgccg aggacgacga ctcctatgtg8460gagctcgtcc tgggccccgc cctgcgggcg ctgcgggcgg cgggcttcgg acaggtcgtg8520atcagcggca atgtggatcc cggcctggcc cgggtgccgc acgccggtac gcacgggctg8580accgtgacgg cggggcggct ggaccacggc gcgttcatcg aggaactgcg cacggcggag8640ctgctggtga cctcgccggg ccgcaccacc ctgctggagg cggcggcgct cggccagcgg8700gccgtcgtcc tgcccccgca gaacttcagc caggtcatga acgccgcgga cgtcgcggac8760ctggtggacc cggccgtcgt ggtcccctgg ccggccgccg tcctggacct ggccgagctg8820gcccgggtcc gcgaccaggg cgaggagggc gcggtgcggc tgatgtacgc ccgtatcgcc8880gcggcgcgcc gggagccggg gacggtggcc ggcccgctgg ccgacgcgct cggcgccgcc8940gtcgcccacg tccgccggca cgacgtccgc atggggccgt tcgccggcac ggacgggagc9000ggcgcgggaa cgcgaggcgc gggaggcgca agagatacag gaggcgcagg aggagcgcgg9060agtgtggcgg acgccgtcga cgagctgatc gggaagctga cggacggccc ggccgccggg9120aatcgcaggg acggatcacc actggcggcg ccggtccggg cgcgctgagg 
gagagaagga9180agcgatgcgg caccccaggg aactcaggca ggacacctcg ctcgcgatca acggggggac9240ccccacgttc gccgcgctcc cggaggagga caccgggatc gtggccgagg ccgccgacga9300ggtggcggag ctgatcagga ccaggcgcac cgtccactgg ggcggcggcc cccacacccg9360cgtcctggaa cgggacttcg cggccctcgt cggccgggag cgcgcgttct tccacaactc9420cggcacggcg gccctgcaga ccgccctctt cgccctggag gtcgaggagg gcacccccgt9480cgccctcagc gactccggtt tcgtcgccag tctcaacgcc ctctaccacc tccgggcgcg9540gccggtcttc ctgcccaccc acccggccac gctgcagtgc gtcgacgacg tcgcggagtg9600gaccgccggg accggcgtcc acacggcgct gatcacccac ttcttcggca acgtcgccga9660cgtcgaggcg atctggcgca cctccggggc ccggcatctg gtcgaggacg gcggccaggc9720ccacggcgcg cggctgcggg gccggccggt cggctccttc gggaccgtcg gctccttcgc9780.
gggctcgacg aagaagctgg tcaccgccgg gcagggcggg ctgaacgtccacgacgacga9840gcacctggac tggcggatgc gcacctacgc gcaccacggc aagtccgggaactacgaagg9900gacgttcccc ggctacaact tccggggcgg ggagatggag gcgatcctcgcccacgccgc9960cctgcggcgc ctggacgagc gcgtcgcggc ccgcaaccgc accgccgacacgatgttccg10020gatcttcgac gaggccggga tccgcaccgc gcgcccggcg cccggactcgactgctcgcc10080cgcctggttc gacgtcgcgc tgatcctcga cgaggagtgg ctgggccaccgcgactggct10140ggtcgaggcg atggtcgccg acggcatccc cggctggcac tacccggcgctgatcggcat10200gccctgggtc gagccgtgga tgcgatccaa gggctggtgg ggcgagcgcgaacaggagct10260gctcgcctcg gagaccgcgc tgtggggccg caccctcgtc ctcggcgcccagatgaacgc10320cgtggacgcc gagcggatcg cccacgccgt cgtggcgctg ctcaagggatgacacggcga10380tgacctgcgg cgagatctcc gaggtgcgcc gggtgctgcg ccggctcggcgacggcgggc10440cgcgttccgt ccgggtcagg gagaacggga actgcgcggt gtacgtgggggaccggctcg10500tggtgcgcgt cggccactcc tggccgctgg acgcccgggg cgagctccactgctggagcg10560tcgcccggga tgcgggggtg cccgcccccg agcggatcga cgagggccggctgcccggcg10620ggcgtacgta cgtggcgtac gtgtacgtca tgggcacccc ggccgggacgcccgcctccc10680tcgcggccgc gggcgccgtg ctggcgcggc tgcacacggt gccgggcgagcacttcccgg10740ccgtggcgca caacctgccc cggcgcaggg accgttaccg cacggcggtgcggtgcgcgc10800gggccgccgg gctcgcgccc ggcggcctcg cccaccgctg tctgctgcgcgcggcggacg10860actggcggcg gtcgcgggag gtggccgcgc acggcgactt ccgcacgcccaacctggtgg10920tccggggccg gggggtgagg gccgtcctgg actggagcga cgcccgcgccgccagccccg10980agagcgatct gggccagctc gggcccgggc agctgcgccc gctcctgcggggctatctgg11040accgtgcccg gcgcgccccg gacctggagc tggtggccgg gcacatgctggcccggcatc11100tcgccctgga ggccgccggg gtgttcccgg cgggcacgtc ggcggcgctcgcccggaggt11160tcgggccggg gctgtcccgg gggaggtgga ccgttgcctg accggagtccggcggccgag11220ccgctgatcc tcgacgtcgg cagcgcgggc cagctcgcgg agctggccggcgacctggtc11280gacctggccg ggcccggcgg cgcgaccggc ccctgggtgc tcacctgggcccacggcgcc11340ggggagccgg gcggggagcc gggcgagggg cagaaccggg ggccgaacgggggcacgggc11400gggggcccgg gcgggacggt ggcccggccg ccgggcgcca cggtcgtgcgccacggcggg11460ctggaggtgg tcacggtgcc ccgtccgcca cgcgacctcg 
gcggtttcctcgacgcgtgc11520tgccgcaccg gcccggtctc gggccacccg gacgtcaccc gcacgatcctcatccttgcc11580gaccccacgg accgggaccg gtccgcttcc cctccggagg cacctcatgacgcaccccgc11640gacggggccc gcgacgggcg gccgtgaccg ctatctcttc atcaggatcctggaggcgtg11700caacgccgac tgcttcatgt gcgagttcgc cctctcccgc gacacctaccgcttcaccct11760cgacgacttc cgcgaactgc tgccgcaggc acaggagtcg ggcgtgcgatacgtccggtt11820caccggcggc gagccgctga tgcacggcga ggtgctcgac ctgatccgcgagggcaccgc11880cgccggcatg cggatgtcgc tcatcaccaa cggcttccgg ctgccgcagatggtcgacaa11940gctggcggag gcggggctgg cgcaggtcat cgtcagcctc gacggctcctccggtgagac12000gcacgacgtc taccggcgca cccccgggat gttcgaccgc gggctggacggactcgtacg12060cgcctcccgg gcgggcatgc tcacccgcgt caacacggtc gtcgggccgcacaacttcgc12120
gcagatgccg gagctgcagc gggtcctgac cgaggcccgc gtggagcagtgggagatgtc12180cgcgctcaag ctggaacggc acatcgccta ccccccggcc gaggaggtgctccacgcctg12240cgaacccgtc ttcctggccg acccgaagcg gtggctggtg cccctgggcaagcgcttcta12300cggggagacc gccgaggaac gggaggcgtt cttcgagcgc ggcacgaccccgagcgcgtc12360acggccgctg tgccatgtga ccgacgacgt gatgtacctg gaccccaagctgggccgcac12420cttcgcctgc agctgtctgc cccaccggga cggcccgggc gccgacatgcgcgacgagcg12480
gggccgcgtc ttcctcaaca gcccttcgtt ccgcgcgcac gccgaggagttcaagcagca12540ggggcccgtg atctgcagcg gctgctcgac cacggcggcc ggctacagcgacgacgtggc12600ccggctcggc tcggtgcccg cctggcacta ctgaccgggg cgccacgccctttgctcgca12660cgccccgtcc gtacacccgt acgcctcctc caccgcccgc acgtcatcctccgcccagga12720agccgaacat gatcctgcgt accgaccacg tggacgcgta tctgtccgccgtgtccgcca12780tcctcgacga gcccggccgc gccggggccg gcgtccccgt gctgtgccggccgggctctc12840cgctggacgt gctggtgacc cgctggtccg ccctgctggg ccacgccgggccgcgtgccc12900gctcggaccg gccgggccgg gccgtcgtcg cggtcggcga cgaccccgtcgtctccgcgg12960cggcacggct gctcgccgtg ctcacgggac ggaccgcgct ggccgtcgccgacgtcaagg13020agctgcccgc cctgtgggag cggcacgacc tcgtctccac cgcgctggtgggcatcggca13080ccgggttcga cgtcccgggc gtcgagccca gcgccttctg gcggctcgacgcgaccgacg13140cgaccctcgg catcctgacc ggccgggacc gggagtccct gacctggttcgtcgccaaga13200gtctgctcac ctccaccgtc cccggcgacg cgcagacgct gctgctgccggaccgcaagc13260cgcgcgagga cacggcgtcg gcgggcgtgg gtgccggggg cgtcgaggtgctgtacgggg13320ccgccgccga ggaggcgctg cccgcgctcg ccgaggacga gcgggtacgggcgctgatcg13380ccgtggaggc ccacggcagg gccgaccacc tgggggtgcg ggacggcatcatctgcggcg13440accggctggc ccatctgggc cggtccagcg agccggaggg catcgggcgggtgccgcagt13500gcgcgttcgg gcacggctgc ttcaagcccg gcgcccgggt ggcgatctcccgtatgccgg13560cgcagtcgct gttcctgcac agctgcacca gttcgcacac cgaggcggacatgtacgaga13620agtcgttcct gctgggcctg gccgccctgg aagggcccgc ccggcacgtgctgggcaccg13680tccgcccgat gcacgacggg ggccacgagg tcggactcgt ctcggcgttgacggcggcgg13740gcgcctccgc cggcgaggtg acccggctgc tgaacgcctc ctaccaccagcaccgcggcg13800agcccgcgcc ctatctgctg ctcggcgacc cggagctgcc gttcgcggacgggccggtgg13860gcgggccgga cgcgggcccg gccgtggagc tggacgcctc cgccggcgcgctgccgctcg13920gcggccggcg cacggcggtc ctgggcagcg gccccggcgt gctggtcgtgggcgacgcga13980ccggggacga ggacggggac ggcccggggc ttcccgcggg cgtgggcgcgctgaccgtcc14040ggcgcggcga ccgtacggac gtcgtggcgt ggagcaccga gggcccgctccccgaagggg14100cgcttccgtt ggtccgccgg gagggcgggg cggtggccgc ggacggcggtgccgaggagc14160tccacgcccg ctgggaccac gtcgaccacg gcatcgcgtc 
gggcggcgcgctcggcctgc14220tgcccaagga cctcacgggc aggctccagg agctgcggga cctcgccgcagccgtcggca14280ccgccgaccg ggacgcccgc ttcttccccg gccgcctggg cgcggtccggcgcgcagcgg14340cccggctcga ccagcggatc cgcgacgccg accgggcact gatgcacgcgctgctcggcc14400gcaacggcaa gccgttcgac gccgacgaca ggctggagag cgccttcgtgccgctggagt14460
cccagtacgg ccgccaggtg tgctggtgcg gccgggacgc ggtcgtcagccggctgcggc14520cccggctggg cgcccgggaa gtgcgccgga agtacaactg catgcagtgcggggactacg14580cccaggtcgc ggtggacggc gtcgacgtgc gctgggaggc cccggagttcgtggcctcgg14640gaggcgagct ggagcactcc ttccggatcg ccaaccccct tccccacccggtcaccgggg14700tgctcgcgct gagcgtgtcc ccctggtacg gcggcgacgt gtccttccgccccggcatcg14760cgaccttctc ggtggcgccg ggcggcacgt gccgggtggg cgtcacgatgcgcgccgccg14820ggctgaagcc ccaccgctac acggtcgacg cgacggtggt cagccatctgcgcatcaacg14880cctatcgcaa gttcgtgcag gtccgcccgg cgggacccgt cggcccgagcgacgaggacg14940gtgcgctgtg acagcaccta cgaccggacc gaccaccgga cccacgaccgggcccacggc15000cgggcccacg accgggccca cggccgggcc cacggccggg cccacggccggatcggcggt15060cgccgaggag gccgtggcgg agtgggccgc ggcctggctg gagcaggtgcacggggtgcg15120ctacgggccc gacgacgcgc tcttcggctc gctcgactcg ctcgcgctgaccgagctcct15180ggtggcctgc gaggcccatt tcgggctgcg catcgacgag gggttcggctggcaggcgct15240cgcctcggtc cgcagcctcg ccgcccatgt ggccacgggg gtccgcccgcccagcgaccg15300cgtctggttc cggtcgggcc ccggcgcgac cggggacgcg gacctggaccgtacggccgt15360cgtccgggtg gcgctggggc tgccgccggg cgcggccgtg gcccggctctccccgaggga15420gctggcgctg ggcatcggcg ccgccgcagc gccttccgag aagccggccacgaccctgcc15480gcccgagcgc gaacggcttt ccctcgctcg ggagtcgagc acccggcccggctcgctgct15540ggccgtcggc gccacggccg cgcggatccg ggcgttcgcc gggcgcctcgacgccgcgct15600cgcggcggtg ggggccacgc ccgtctggta cccgatcacg accgacagccccgtcggcgc15660cgaccacgtc cagggcatcc cctccgagct gacggcgggc cgcctcgggcacgcgggctg15720tctgcagctg ctcgccgaac tgcccgcgga acgcgacgtc gtgtactcgggcatcgccta15780cgccttccgg gacgagcccg gccgccgctg ggaacccgcc ggccggctggaggcctaccg15840ggtccacgag acggtcgtgc acggcaccga ggagttccgc acggcgatgtggcgacggct15900gtacgagctg gtggaccggg agctgtcggc cctcggcccg ggcggctggcaggaggggcg15960ggacggcttc accccccggc atggaccgca agctcgaatg gctgctggaactcgacgctg16020ggcacggtga gcgtgggggg cacggccatc cgggtggacg gatcgcggtggcctcgctca16080acgaccacgg cgggctcttc gcggcggcgc cggacggctc cggcacgccggacggcggcc16140cgccgggctc cttctgtctg ggcatcggcg tcgaccggct 
cgcctcgctgggggtgatct16200gatggacgcc gcgcccggca ccgcccgtac ggccgcgggc acgtccgtaccgcccgtact16260ccccgtcgat gccgaacgtc ccgccgcccg gcgcaccctg gccatggaggagggcacacc16320ccggcagtgg gagggcctag ggctgcacgg tgttccggag gccgtggaggcggcgctggg16380cccggccgcc gagctggtcg tcgccgcgcg gggcggcggc cggtccccgctgcccggcct16440cgtcttcgcc cagccctgcc tcggccgctc cgccggcgtg gcccgggacctgcccgtctc16500cgtggtgtgg gagacgggcg tggccctcgc gatcgcccgg gcgctggaccggcccgcggt16560gatcgggctg tgcgtgtacg aggagatcct ccagcagccg caccgggacgccgagttcac16620cgcgctgggc gcggccgtcg cgcggaccgt cgaggcgctg ggccggctgctgggcgtggc16680ggtcaccgcc cgcgtcgaga ccgccgcgcc ccgcgccgcg gaggtgccggcgcgacggct16740
ctacggtctg tacacgccgt tctccgaatc cacctatccg aggggtttcc ccaacgaggc 16800
ggaggtgctg cgcgccttct ccgcgtactg cgggcgctac gaggacgccgcccggcggga16860ggcgtccctg tgggtgacgg aaggcgtgca cctggccaag gcggcgctcctcggcctcgg16920ccccggcgtg cccttcctgg ccaccacccc gctgcccgac cctgcgcaccccggccggct16980tctccaggac gccccggccg ccacccgggt caccctggaa cgccgctcggcgctgcctgc17040cgactggtgg ccggagcagg cgctggaacg cgcgctcggg accggtctgcggcggctgac17100cgaggacttc cacgcgctga tcgaagactt ccacgacccg gcgggagaccgatgagaacg17160ccccggaccg gagccgtcct cggcgggcgc ggccccgccc tgcccttcgtcgcctatatg17220
gcgctctcca acgcccagtt cacccggggc gtgttcgtcc tcttcctgctgcgcgggaac17280atcagcctgg ccgaagtcgg actgctggag agcctgttcc acctcacccgggtgctctgc17340gaggtgcccg ccggcagcgt cgccgaccgc tggggccgcc gtcgtacgatccaggcgggc17400ctgatcctct cggcggcggc gatgccggcg ttcctgctcg gcgggatgttctggtacgcg17460ctggcgttcg tgttccaggg cgcgggctgg gccgcccagc gcggcgccgacaccgcgctg17520ctgtacgagc tgctggaacg gaccggcggg accgatcgct acgcccgcatcctggggcgc17580tcccacgcgg cctcgtacgg gacgctcgcc ctcaccaccg cgctcggcgcgatgctctac17640cagcggcacg tcagcctgcc gttctggctt caggcggccg tcaccctgctggccgtcggg17700gcgatcggcg tgctgccgga gagctcgggg acggcggcgt cgggggcggggtcttcgggg17760tcggggtcct cgggggagcc ggccgaacgg cccatgggtg tctggcggctggcccgcgcg17820ggggcccggc tggtggtcgg ccaccccgtg ctgaggctct tcgtcgccttcgtcgccctg17880gtcgaggccg ggacgacggt ggtgagcatc ttctcccaga gcttcttccggacgctcgga17940tacggcaccg ccaccaccgg gctgatcctc gccctggtca cggccttcagcgcggcggcc18000gcgctgcagt cccaccgcct cgtcgaacgc ggtccggtcc gggtgctgatggccgcttcg18060agcctgtacc tcgtggggct ggccgggatg gcctcgctgc agccgcagctggccgtcgtg18120ggctactacc tcgtcttcct caaccttgac ctgctcgccc cggtgctgagcgccttcttc18180caccgctctg tggacgagga ggtgcgagcg accgccggtt cgtacctcaacctgtcgacc18240agcgtgctca ccttcgccgc cttcccgctc tccggctcgc tgatcgacgccggcggctac18300cgcccgctgc tgatcatcac cgccctggtc agcctgccgc tcctggtcttcctcgtcggc18360gcggcccggc gggtcctctc accgccggaa gagggcgatt ccggggaggacgccggggag18420cgggccgggc ccaaggggcc cggtgcggcg gcacccgaca ccaccacgacgggagtgtga18480gaacaccatg accaccaggg ctgactcccc gtctcccggg tccggcgggcctgtcggacc18540cggcgggtcc ggcggcgacg acggacggcc ggtgatcgcg ctgcgcttcgccccggccga18600cgtcgaagcg gcggccgcgg cggagtacgt cgccgcgcac ctcggcggtttccggtgcct18660gccggagtgt ccccaggagg gcgattccgg cccgggccgg aatccacccgccgccgtgat18720cgtcttcggg cggtccggtg ccgccggagg ggccggtccc gcgggcgtgcccaccgtcct18780ggtcgagggc gcggaaccgg tgcccggcac ggacgcggac gtcgtctgccggcaggcgcc18840cggctggctc accgccgggg aaccgcccgc cccgcccgcc gtacgccccggcggcggccg18900gatccgcacc gtggacgtgg ccgccgtcgc gcccttccgt 
caggtgcggtcgggcggggg18960tggcgggcgg gctgccctcc tgctcggcgg ggccggtggg cccgacgggtccggtgcgtc19020cgccgggggc gaggctcttc ccggcgccct cgcccggttc atcgccgggcatccggccgc19080cgccggtgac gcgtgggccg tgctcaccga tctcaccggg gagcccctgcgggagctgct19140
cggcctgctg cccccgaccg cccgcacggt gggtgcggcg gactgggcccaggtcctgcg19200ccgcgcggac tcgttggtgg cgacccccac cctgctggcc gccgcccatgcccgtaccgc19260ccggatcccc ctgcacgtac tggacccggc gggaccggcc cagcggcgcgtccaccgggc19320gctggccgcg atcgccggcg ctcccgggga gccgggcggc ctcccggtggtcgggcccga19380cgactggccg cgtgacgacg gccgcgccgg agccctgggc ggggccgcgcagatcgcccg19440gcaggtgcgg cagttgtgcc tcgcgccggc ctgaaccgtc cggcgggtcctgtcacgtcc19500cttgagacgt ccctccgggg cgtcccccac gcaaaggtat ggatggcatgtccgacactc19560tcgcgcacaa ccgtcccctc gacctgaccc agcacgagat agcggccctgcgctccgagc19620acaatctcgc ggacgcgcac acgcaccagt accagtcgcc ggcccagcagctcatcgtgg19680actccctgcc cgccctctgg cacgaggcgg agaagggccg gcaggccgatttcgaacagc19740ggttcatcga ggcgttcttc cggctgcacg gccagcccac ggccatcggcctggaccgca19800cgctgctcac ctacgccgcc tccatctcca cgatgatcgc cgggatgttcctcaagcgcc19860gcgacgcgcg ggtgacgctg gtcgagccct gcttcgacaa cctccccgacctgctcgtca19920atctgggcgt tccgctcacc gccctccccg aggatgccct gcgcgaccccgcgcgcatcc19980accgcgaact gtcacggctg gtgaccaccg aggcgctttt tctcgtcgaccccaacaacc20040cgactggcca tagcctgttc gccgacggca tgcgcggctt cgaggaggtcgtacggttct20100gccgcgagcg cggcacggtc ctcgtcctcg acctgtgctt cgcggccttcgccctcggca20160gtggcggacc cggccgtcac gacgtctacg agctgctgga gaactccggcgtcacctaca20220tcgccatgga ggacaccggc aagacctggc ccgtccagga cgccaaatgcgccctgctca20280ccaccagcgc cgacatctac cccgccgtgt acaacctcca caccagcgtcctgctgaacg20340tctcgccctt catcctgaac accctcaccc gctacatcga ggattcccggcgggacggct20400tcgcctccgt gaccgacgtc ctcgaacgca atcgcaagtc cctgcgggcggccaccgagg20460gcacggtgct ccgcgcccac gagcccgacg tcccggtcag cgtcgcctggttcaccatcg20520acgaccgcgg cccggacgcc acgcagctgc agcgcgacct ctccggccacggcatccacg20580tcctgcccgg tacgtacttc tactggaacg agccgagccg cggcgagcgctacgtccggg20640tggcgctggc gcgtgatccc ggggagttcg acgcctccat ggcccggctgcggacgcttc20700tcgcccgcta tgcgtgagcc cggcctcatc gctccgctgg tcaccccgctgacccccgac20760ggcgcggtct cggaagcgtg cgtacgggcg caggtcgcgc gcgtccgcccgtacgtccgc20820gccctgatgc ccggcatcag ctgcggggag gggtggctcc 
tggaccgtccgcggtgggag20880cggctggccg ccgccgtcct ggactgccgc gacggcctgc ccgtccacctcggtgtccag20940gcggcggaca cggcggaggt gatccggcgc gcccgctggg ccgtacggcacggggccgac21000gccgtcacgg tcggcccccc gcacggcgcg ggcgcccggc agcgggcggtccacgagcac21060ttcgcgcggg tctgcgcggc ggtcgacacc cccgtctgcg tctaccacgagagcgtcgtc21120agcggcacgc gcatgacgcc cgccacgctg accgccgtct gccggctcgacggcgtccgc21180gccgtgaagg agtcgggccg cgagccgtcc gtcaccaacg acctcatcgccgcggttccc21240gacgtggccg tccaccaggg ctgggaggac ctcttccacg ccacgcccggggccgccggg21300ctgatcgggc cccttgtcct catcgacccg gcgctgtgcg cggagctcgtcgccggggtg21360ggtggggtgc agggggtggt gacggaccgc tgtcgtgagc tggggcttttccgacctgat21420tatgtggccc gcaccaagcg ggagttgtgc cggctgggtg tcctggcccatgccgtgacg21480.
ctgtgacccc ccaccgtacg gaaatgggag tgaccatgaa tccttcgaagacctttctcg 21540ttgtcgggcc gctgcgtgcc gacaccggct ggcagtagag ggcacggccgatcatttctg 21600agttctcgtg gagcgaggcg gtgcggctgg ccggcgtcgc ggcggaggcgctcggggcgg 21660gagatctggc gggcgccgtc ggggcgcttg accgggtggc cgcgctgatccggctggcgg 21720gggagtcggg gggcgggggt gctgcggccg gggtgcgggg ctttcgggcgagtgcggcgc21780
tgatctggga cgccttcgcg gcggctgcgt ccgggccgtg cgacgcgctgcggatcgcgg 21840aggtctgccg ggcgctgcgg gggctggacg aggcggtggc ctcctgggaggagacctgtt21900accggttctt cccggcgctg ggtggggagg agggggcggg ctgtgcggggcctacggctt21960ggtgagcgtc gggggcgggg ccgtgggtcg ggggcggggt ggggtggacgggcccttacg 22020gggctgatct cttcgcggtg cctgccttca ggggagggtg ccccgtatttggcttcagcg 22080gcgaagagct ctttacgccc ctaccagggc ccgttcaccc caccccgccccctcgcgtca 22140ccacctcgcg ggctgcggtt ccgggtgggc gggtgggcga aaccccgcgccgccaggcgc22200gggaaacccc acggcgggtc agccggagag tccacggaac ccccgcaggggaggcggtcc22260gccggaaagg cggaggaggc cgcggtccgc cgccagaacg cccgcctggaagcgactctc22320cgcccccaac tccgccatca tctccgcgat gtgccgccga cacgtccgcgtggacatgtt22380catccgcttg gcgatgacct cgtccttcgc cccggccgcc atcagccgcaagatgcccgc22440gcggatctcg tcggcggcgg ggccgtagcc cacgtgcgtg tagacgaagggtttcgccag 22500gcgccagacc tgttcgatgg tccggtagag gtagtccacg accgccgggtggcggatgac22560gaccgcgccg gggccgtcgg agcggcggtc ggccaggaag gccagcgattggtcgaagat22620gacgacgcga tcgagcaccc cggtcgtggt cctgatctgg gtgccggcctcgtgcatcag 22680cgagaagtgc tgctggaccg ccgggctcga caggacggtg tgcggatacacggtccgtat22740ggcgatgccc cgggtcagca gggacaggtc ccgggggcgg ctgtcggcgagggtgctctc22800cagcagggct tcgggctggg cggtgagcac ctcgtgccgg cagtcgcgcgcggcagcgct22860caacagcccg cggatcgtgc tgatgtcggt gagggactcg atctgcggcgccttccgggc22920gcgcccctcg ttgaccgcgt cgtacgcgtc ctggagcgag gccatggcgctgcgcagccg 22980ctcgtcctcc aggcgctgcc gctggatctc gccctcgcgg accgcgctcagcgcggcggc23040cgccgactgc gggctgatcg cggccagcag tcgcccgcgg tccgtgtgctggatcaggcg 23100cagcgcgacg agcgcgtcga tcgcctccgt gagttcgtgg tccccgtcggcgtccgttcc23160ctcgccgctg tgttccgggg ggccgggcgt gcgcagggcc gtacggggcagggagccgcc23220ggcccgcagg atctccaggt acacggcgcg ggcccggccg gtcagccgggcctccgtcag 23280ggggcacacg ccggtcatcg gccggcctgc ccggcgccct ccaccacggcgggctcccag 23340accggggcgg tcacctcctt gaccgcgccg tccccccgga agtgcaggaagcggtcgaag 23400gaccgggtga accagcggtc gtgggtgacg gccagtacgg tgccgcggaagcccgcgagg 23460ccctgctcca gcgcttcggc gctcgcgagg tccaggttgt 
ccgtcggctcgtcgagcagc23520agcaaggtcg ccccggagag ctccaggagg aggatgagga agcgggcctgttggcctccg 23580gacagcgtct cgaagcgctg gccgccctgg ccggccagtt cgtagcggccgagggcggcc23640atcgcctcgt ccctgggcag gctgtcgcgc cggacgtcgc ccttccagaggatgtcctcg 23700agggtgcggc ccacgagttc gggccggtcg tgggtctggg agaagtgcccgggcaccacc23760cgggcgccca gtcgggcgct gccggtgtgc gccacgggct ccagcggggtgagcgacggc23820
agctcggggt cgctgccccc gcggcccagc agccgcagga agtgggacttgccggtgccg23880ttcgccccca ggacggcgat gcggtcgccg taccaggcct cgaagccgaagggatccgtc23940agcccgtcca gtcccagccg ctcgcagacg acggcccgtt tgccggtgcggtcgcccgtc24000agccgcatcc ggatgttctg ctcgcgcggc cggggcggcg ggggcggctgcgcctcgaac24060ttcgccagcc gggtgcgggc ggcctgcagc cggctggcca tggcgtcgttgtgcgaggcc24120ttgacctggt agtggcggac gagctccttg agcttggcgt gctcctcgtcccagcgacgg24180cgctcctcct cgaagcgctc gtagcgggag acgcgggcgt cgtgccaggaggcgaacgag24240cccgggtgca tccaggcgga gccgccctcg acggtgacca cgcgcgaggcggtgttggcc24300agcagctcgc ggtcgtgcga gacgtagagc accgtcttcg gggactcggcgaggcgggcc24360tccagccggc gcttgccggg gacgtcgagg aagttgtcgg gctcgtcgaggagcaggacc24420tcgtccggcc cggcgagcag gagcgagagg gcgaacctct tctgctcgccgcccgacagg24480gtgcgcaccg gacgcgaacg ggcctcgtcc cagggtgtgc cgaggatgtcggtgacgacg24540gtgtcgaaga cgacctcctg ttcgtatccc ccggcgtcgc cccaggccaccagggcctcg24600gcgtagcgca gttgtgcctt ctcgccggcg ccgggtacgg ccatcgcggtctccgcccgc24660gccagtgcct cgccggcgcc gcggagcccg gcgggggaga gggagagggccagcccggcg24720agcgtggtct cgtcgctgac catcccgatg aactgccgca tcacgccgagcccgcccgag24780cgggcgacgg ccccgcgcgt cacggggaga tcgcccgcga tcatgcgcaacaacgtggtc24840ttgccggcgc cgttcggtcc gacgagggcg accttcatgc cctcgcccactctgaaggag24900acgtcttcga agagaacgcg cccatctggc agtacatgac ggagacttgtcacatcgaca24960tatcccatgt gcggaatctt gcaacatgca cgggatctct gtcacgcgactttgcggaac25020cagccactct ggtatgtatc cctgggtaag cggcttgatt cgcatgtccgttcgcaaggg25080gtggatgtcc tttccggctc ttgatctcgt gtgcgccagg cggtaattgggcccgttcgg25140ggggcccgtt tcccgtaggg tggacgcgtg atcgaggacg gcggcagcgcgcggggaagt25200gtcaccacgg tgcggcgtgt gggggacacc gtccgccgtc cgcgcggccgctggaccgcc25260aacgtgcacg ccctgctgcg ccatctggcg gacgccgggt tcctccgcgcgccccgggcg25320ctgggcgtcg acgaggacgg gagcgagatc ctgtccttcc tcgacggcgaggtcgcgatg25380cgtccctggc cggccgcgtt gcgggagcgg tccggtgttg tcgagctggccgtgtggctg25440cgcgaatacc acgatgttgt acgggacttc cgtccgccgt gccctgatgagtggttcgtg25500cccggtgtct cctggcgtcc cgggcggctc gtccgccacg 
gtgacctgggaccctggaac25560tccgtctggc gtggctcccg gctcgtgggc ttcatcgact gggacttcgccgagcccggc25620gatcccctcg acgacctggc ccagctcgcc tggtactgcg tccctctgggcgggcgtgcg25680actggggcgg gcggtgagga gagccgggtg cgggtccggg agcgcctcgcggccgtgtgc25740acggcctacg gggccgagcc cgtgtccgtc ctggacgccc tggccgggctgcaggagcgc25800gaggcccgcc gcatcaccga cctgggcggc cggggcctcg agccgtggacgtccttcctc25860gcccggggcg acgcgacggc gatcgaggag gagcgcgctt ggctgctgacccaccgggag25920gggttgctgg tgggatgagc gggcccggtg ggtggggcgg gggcggggtggggtggacgg25980gcccttgacg gggctgagct cttcgcggtg cctgccttca ggggagggtgccccgtgttt26040tcctccagcg gcgaagagct ctttacgccc ctaccagggc ccgtccgccccaccccgccc26100cctcgcgtca ccaccaccgg tcgctcgtgg ccgagcaatc aggtccgggtgatcggggcg 26160
ggtgggcgaa atccccgcgc cgccaggcgc gggaaccccc caccggcgggcagccggaaa26220gccacggcac cccgccaggg ggtcacgggc gcgtgggggc atccgtcgatcgatggccgc26280ccggcggtca gacgtccgcg tcgcccgcca aaaggtcgac gccgaacagctcccgaaagg26340cccgttcacc cgctgcggtg atcttcaacg cccgcccgga cccgatccgttccacccagt26400gccgctcgag cgccgcacgg caaagcgccg cgccgagcgc accgccgaggtgcccccggc26460gttcggtcca gtccaggcag ctgcggacca cgggtctcga ccccgtgcggaccggcaggg26520ggacgcccag ctcggcgagg cgggtccggc cgtgcccggt gatggagagcccggcgtcgt26580cggtgacgat cccctgtccg agcagggcgt cggagagggc cacccccaggcggccggcga26640ggtggtcgta gcaggtgcga gcccgcgcct cggcgctcgt ccggctcgccccgcgcaggt26700tgccgggggc cgggtcgggc ggtgaccagg aggtcaggtc ctcgatcagggcggccactt26760cgggcccggc cagccggacg tagcggtgcc ggccctggcg ctcctcggcgagcaggccgg26820cggagatcag ccgggagagg tgctcgctgg cggtggaccg cgcgactccggcgtgccggg26880ccagttcgcc cgcggtccag gcccggccgt cgagcagggc cgtgcagaaggcggcccggg26940tccggtcggc gagcagcccg gcgatctgag cgagtgacat gcgcccatcatgcggcggga27000tcggttcggc ggccgccgaa cagttccgct cctaccgtcg gggcatgacccacaccccgc27060ataccttcac ccggtacgcc gccgtcggca ccccggtcgc cctcggcgatggcgtgccga27120tccgggcccg cgcgtccgtg gcggaccaca ccccggtctg gcggcccgcagccgccgcta27180cggccgcagc cgaatccgtc gccaccccgg cggcagccga gcgcctcgccgccccggcct27240cggccgcaac cgaggccacg acccccttcg cggccttcgc cgcgctgcatcggcccggct27300cgccgcttct gcttcccaat gcctgggacc acgcctcggc ggtggctctcgtcgaggcgg27360gcttcctggc gatcgggacg acgagtctcg gtgtggccgc ggcggtcggtcggcccgacg27420ccgtgggggc gacccgggag gagaccctgc ggctggcccg gcggctcgggcgggggcggg27480aacgggggcg gttcctgctg tccgtggacg ctgaaggcgg gttctccgacgatccggcgg27540acgtggccga gctggcccgt gagctggccg gggccggggt ggtcggcatcaacctggagg27600acggccgctc cgacggcacg ctcgcccccg tggagctgca cgtcgcgaagatcgaggcgg27660tgaaggccgc ggtccccggc ctcttcgtca acgcccgtac cgacgtctactggctgggcg27720gcggccagga gggcgaggac aaggacgagg acgagacgtc gtaccggctcgacgcctaca27780gccgggcggg cgccgacggc gtgttcgtac cgggcctgtc cgaccgtacgggcatcgcga27840ggctggtgga gcggctccac gtgccgctga acatcctcca 
caccccctccggccccaccg27900tcgccgagct cggcgagctg ggcgtggcca gggtcagcct cggttccctgctgttccggg27960tggccctggg cgcggcggtc ggcgcggcgg tggacatccg ggcgggccgtccggcgggag28020cgggcgcgcc gtcctacgac gaggtccagg accggatccg gatcacgggcccgctgggct28080gagctcagcc gacgcgtacg gccaggacgg cgatgtcgtc gttgaggccctggtcgctgt28140ggtcgaggag atcgcggtgc agcctgtcga gcagctcgcg ggggtgggccgggggctgct28200gccgcatcca gtccgccagg gggaagaagc cgccgtcgcg gccgcgggcctcggtgacgc28260cgtcggtgta gaggagcagc tgatcgccgg gggcgatgtc gaaggtgtcgacggtgtagg28320agtcgccgat gaggtccgcg aggctgagca gcggggaggg ggccgtgggtttcagggagc28380ggagttcccc gcggttcagg aggagcggtg gggggtggcc gcagttgaggatccggatgc28440ggccgtcctc gtgcgggatc tcgacgagga gggcggtggc gaagcgttccaccaggtcct28500.
cggggggaaa cgcggcgctg tagcggctgc tgctggcctc cagacgccgtgcgatgccgc28560ccaggtcggg ttcgtcgtgg gcggcctccc ggaaggagtt caccaccgccgcggccgccc 28620ccacggccgg caggcccttg ccccgtacgt cgccgatgag cagccggactccgtacgcgg 28680tgtcggccgc ctcgtagaag tcgccgccga tccgggcctc cgccgcggccgcgaggtaca 28740gcgagtcgat ctcgacgtcc ccgaagcggc gcggcatggg ccccaggaccaccatctgcg 28800ccgcgtcggc gacgagccgg acctggaaga gggtgcgttc ccgctggagccgcacatggc28860ttccgtacgc cgccgccacg gtgacggcga cgatgccccc cgccgtccaccacgtcccca 28920gcccggggaa gacgatgctc aggccgatca tgaggaacag gcagaccgtccccagcagca 28980cggtggggag cacgggccac atggctgcgg cgagcgcggg cgcggcgggcaggagccggc29040tgaaggccat gcgccggggc gtgttgtagg ccagggcggc gatgaccacggtcaggatca 29100ccggggagag gagaacaggg gaccacctgc cgtggagacg gcggggccgcggccggtcat29160gcttgaccat gagacatagc ttatccgtat aaaacggaca tagggctccgggaagtcacc29220cggtcggagg gtctcctagc cctgtgtggg gcgaggggag gggtggtggggcgggtggtc29280gtgggtcggc gggtccagga agcggtcgac cagcagacgg cgcgggcccggccccgccct29340cccgagccgg tccttgggat tggccgccat gcagcggtcc agggacaggcatccgcagcc29400gatgcagtcg tcgagccggt cgcgcagccg tgtcagctgt tcgatgcgggcgtcgaggtc29460gtcccgccag ctccgggaca gggcttccca gtcctcctgg ttcggcgtgcgccgctcggg 29520caggtcggcc agcgcttcct ggacctcgcg cagggagatg ccgacgctctgggcgacccg 29580tacgagcgcc acctgccgga gcgtggcccg ggggtagcgg cgctggttgcccgaggtgcg 29640gcggctgtgg atcaggccca tggactcgta gaaccgcagc gcgctggtcgcgacgccgct29700gcgctccgcg agctcgccga tggtgagttc cttcgcgttg caagggggcctttccatgtc29760tccaccgtat ctgggtcttc aagttaagtt gaggtttttg gggtgggggcggtggggggc29820ggtgcgcggt gcgccgtgcg cggtgagccg cgtacggcgt acggtcggcgctccgcctca 29880gcgctccgcc cgtacggacg tggccgagcc gcgccgcccc gcgcaggcggtgacgcccgc29940cagggccgcg acggcggcga acccgaacag gacgtagtgc gccgagcgcgtcatggcggg 30000tgcgccgagg ttgaccagga cgccggcgag ggcggcgccg aaggagaaggcgaacagccc30060gatcgtgttc agggccgcgg acgccttcgc cgcctcctcg gggtcccgggtgctgcccat30120caccgccgtg gacaggtggg gcatggccat gccgatgccg gagcccgccaccagataggc30180cgcggcccag gccgccaccg tgagcggtcc ggcgtcctcg 
cgctggaggaggcccgtgag 30240cgtcaggccg gcggccagga cgaacggtcc cgccaggctc aggcggccgagcgtcgcggg 30300ccgggcgccg gagaccgcga cctgcgtcag cgcccacccg acgggcaacgaggtgcccag 30360gaagccggcc gccaccggcg gcagaccgcc cagccgctgg ccgaacagggagatgaacgc30420ctcgacggag gcggcggtcg tgatgaggac cctgacgagg tagagccaccggagcgagga 30480cccggcggcg taggtcgccg ccggcagcac ccgggcgcgt gctccgggccgccgctcgct30540catcacgtag accacgatca gggcgcacgc gacggtgacg gccacggcggtgggcccggg 30600cccggacagc acgccggcca cgctgatcac cgtcgccgtc gcggtgagcagcaccaggga 30660gaccagtggg agcgcgccgg cgtcccccgc ccggcggccc gagggaacggctcgcgacac30720gagtgccacc aagggcagcg ccaggaccgc cacgacggcg aacgccagccgccaggcccc30780gagctgggcg aacagcccgc cgatcgcggg cccgacgaag aagccggccgccatcatcgc30840
cgacaccagg cccgtgcccc gcgcccagag gcgctcgggc agcaccgactggacggtgac30900gtagctcagc cccgccagga gcccggcgcc gaacccctgg aggacccgccccgccagcag30960cacctccatc gtgggcgtga cggccgcgac gaccgtgccg aggacgaacgccccgatgcc31020gatccggtag ccgccccggg ggccgcgcgt ggacaggacg cggctgacgagcatcgcgga31080gatcaccgag gcgatcgcga aggcggtcgc ggtccacgcg tagaggcgttcgccgccgat31140gtcctcgatg gccgtgggca gcaggctggt ggtcacccac gtactggtgccgtccaggag31200catcaccccg gcgagcagca gagcggtggc ccggtgttcg ggcccgaagagctcgcgcca31260gccgccgggg cgtgtgccgg ggaggggtac ggaagtctcg gacgtctcggagatctcgga31320ggtgttgttc ggtatcgcca cccccgcacc aaacaacttc aacagctcttgaagtcaacc31380gcgcgtccgc gcgtccacgc cggtgggctg ccgtacgagt ccaggttgtcggtcatccgc31440tccaggacgg tgacggcgat ccggtactcc tcgcgggtga tgccgaccgtcgacagctcg31500cggaaggcgt ggacgtgctc ggcgacgtcg gcgaggcggg tacggccgtcttcggtcagg31560gccagacggc ccggttccgg gcgggcgacc cagccgtcgg cgatgaccgccccgatggcg31620gcggccaggg cggtggcgtc cgcgttggcg gccaggacgg tcagcacttcggtgtcggtg31680gcctgtggat cgtccttgat gacgttgagg acctgccagt cggtccgggtgatgccgaat31740ccggccagca aggagttcat acggtgggtg agagcgctgt cggtgcggttgagccagtag31800ccgatgggct tcatgttcgc gttcctgagc cgggtcagtg atgccgggccccggccttgc31860ggggcagcag gcgtacgagg gcgcagcagg cgagggtggc cgcggcgaccacggcgaggc31920tgtggacggt cgcggtcgcg gttgcggtcg tggaggcgcc ggggcgtggaagcaagggga31980cggacctgtg cggtgcgggc ccggccggat cagtgccggg tgaaggcggcggcgttctcg32040cgcgcccact gccggaaggt gcgggcgggg cggccgagga gggtacgggtggtgtcggcg32100atggccgcgg gaccgtggtc ggcggcctcc cacaggtcga gcagcgaggtgaccatcggc32160gcgggcatgt agtggcccat ctgctgctcg gcctcggcgc gggtgatgcgctcgacgggg32220atctcgcggc cgagggcgtc cgcgaggacg gcgagttgct cgcggaacgtgagcgactcg32280gggccggtca gggtgaccga gcggccggtg agggaggtgc cggtcagcgcctcgacggcg32340atgtcggcga tgtcctcggg gtggatgggc gcgatgtgcg cgtccgggtaggcgagccgg32400acgggcagcg accggccgat gaagtgggcc cagccgaggg agttgctggcgaaggcgtcc32460gggcgcagga acgtacgggt gagaccggag ccggcgaggg cgcgctcgacctggaggctg32520tggcccgcga gcgggtcggt ctcggcgtcc gggcccagga 
ccgaggaggacgagagcagg32580acgacgtgct cgacaccggc gccctcggcc gccttgatca gctcatggatgccggacggc32640tgggggtaga ggaagacctg gcggacgccg cggagcgcgg ggccgaaggtctcgggccgg32700tcgaggacga gctcggcggt ctcgacgccg tccgggacgg ccagttcggcggggaccgcg32760ctggcggcgc ggacggtgag gccggcggag tgcagacggt gggtgaccgcctgggcgacc32820ttgccgcggg cgccggtgac gaggatggcc atggagtgct ccattcattgctgatgacat32880atgcatgctt gcatgcacac atttgttggt caacacatgc cttcgtgatgtcatccatgt32940ctgtacgatg aggggcatgg cgaagcgcga acccaagacg gcggacgagctgctggacgc33000cgtgggcccg gccttcggga agctgcggcg ctcctcgctc ctcgaggtcgagaacccgat33060ctcccagaag gacctgagcc gcacgctggt gctcagagtc gtcctggaggcggaacggga33120
agcggagccg gcagcggaac agggcgccgc gcagggcgag gcggacgagc ggtccgacgc 33180
cggggagatc acggtcggcg cggtcgccca gcacctggga gtggacccgt cggtggccag33240ccgtatggtc tccgactgca tctcggccgg ctatctggtc cgcgcggcct cccagcgcga33300cggccgccgc accgtcctcc acctcagccc cgagggccgt gagctgatgg cccgcttcgg33360ccgccaccag cgctcggcct tcgagtgcat caccgccgac tggaccgagc gggaccgcct33420ggaattcgcc cgcctcatgc tcaagtacgt cgactcccag gacgccctcc gccaccggcc 33480cccggtcaag gacgccgtgc gctgaaccgc ccggcgggcc gtcccccggg cggctcgtcc33540ccgggcggtg cgctcgctgg tgcgacgacg gtgtgaccgt aggcgcgaca gtacgaccgc33600gggcatcacg ctgtgaccgt aggcgcgatg gtgcggccgc aggtgcgacg acggcacgac33660cgcaggcatg acgaaggccc cgaccactcg agtggtcggg gccttcgtgg cagctgacgg33720catacgaacc cgcccgcctc acccctcctt cggcgcaagg ccggtgagcg cggcccccag33780ccggcgcgcg ccctcggcca gctcggcgtg atcggcggtc gcggcgaacc cgatgcgcag33840gtgggccgca ggcggctccg cggcgaagtg gcgactgccg gcgctcacgg cgacgccgcg33900ctgccgggcg gcgccggcga gggcggtgtc gtccacgccc gacggcaggc ggacccacag33960gtgcagtccg cccgtgggca accgggccag ggtcgcatcg ggaagctcct gggcgatcgc34020cgcggccagg acggcgcacc gctcccgcag cgccgtaccg agggaacgga cgtgccggtc34080ccaggacggg gagctgagca cctccagtgc cgcctcctgg agcgggcgcg tgacgaagaa34140gtcgtcgacc aggcgcaccg cccgcatgcg ctccatgacc ggtccgcggg ccaccagcgc34200cccgatccgc aggctcggcg cggcgggctt ggtgagcgag gtgacgtgga cgaccgtgcc34260gtcacggtcg tcggcgatca acggccgtgc cacggcgccg ccgtgtccca ggtgccgtgc34320gaagtcgtcc tcgagcacga aggcgcccga ggcgcgcgcc acgtcgagga tctggcggcg34380tcgttcgggt gccagcacgg cgccggtcgg gttctggaag gtcggctggc agtacagcag34440tcgcgcgccg gtcatcgcga acgcgtcggc cagcatgtcg ggccgcaggc cgtcggcgtc34500gagcggtacc ggaaccggtc gcagccccgc ggcgcgagcc gcggccaggg cctggggata34560ggtcggggac tccaccagga ccgggctgcc gggaccggcg atggcccgga acgcgatcga34620cagggcactc tgcccaccgg cggtgaccag cacgtcctcc ggcgccactc cgccgccgac34680gatccgggcg aacacggtgc gcagcgccgt cagtccgtcg gccggggcac ggtcccaggc34740gtccggacga cgtgccgccc gcgcgagcgc cgcgctcagg gcccgggcgg gctggagcga34800gctgtgcacg tagccgccgt ccatcgcgat cgtcccggcc ggtggcgggc cgagcggctc34860ggcgatcagg tgggtgtcga ccgcgcggtc 
ggtgagggcg accgtctgcc agtcggtgtc34920catctgccag ttggcgtcca tgtcgccgcc gctaccgccg cccctgccgc cgcccgcgcc34980gcggagtcgg ctgcgctgcg ccacgaacgt cccgctgccc gggcgggtca ccaccgcgcc35040ctcggcggcc agcgcggcga tggtccgcgc cactgtcgcc ggaccgatcc ggtactcctt35100gatcagctcc cggctgctcg gcagccggtc gccgggcgcc agccgggaga ccagcgcgcg35160gaggctatcg gccaactcgg cagaagtgct accgtcgttc atgagagatc acagtagcgc35220ttctggttct gctcgggaag cacttcagct ggacgtcccc gccctgcggg ccgacacccc35280ggggtgccgc cgggtcatcc acttcaacaa cgcgggctgc ggactgatgg cggcgcccgt35340gacggacgcg atggtcggcc atctgaacct cgaggccagg atcggtggtt acgaggcgtc35400ggccgcccgg gccgccgagg tccgcgggtt ccacacggag atcgccgccc tcatcaacac35460cacacccgac aacatcgcct tcgccggcag cgccacccac gcctacgcca acgccctgtc35520
ctcgataccg ttcgaggccg acgacgtcat cctcaccacc cgcgacgacttcgtctccca 35580ccagatcgcc ttcctctccc tgcgcaaacg attcggcgta cgcgtcgtccacgcgcccaa 35640caccccggag ggcgggcccg atgtggaggc gatggccgcg ctgatgcggacccaccgccc35700ccgcctggtc tccgtcaccc acgtcccgac caactcgggc ctcgtctcgcccgtcgccgc35760gatcggccgc cactgccggg agctggacct gctctacctg gtcgacgcctgtcagtcggt35820gggccagctc gtcatcgacg tggaggagat cggctgcgac ctcctcaccgccacctgccg 35880
caagttcctc cgcggcccgc gcggttccgg cttcctctac gtatccgatcgcgtcctgcg 35940cgcgggttac gaaccgctgt tcatcgacat gcacggggcc cgctggaccgagccgggcgg 36000ctacgagccc gtggggacgg cggcccgttt cgaggagtgg gagttcccgtacgccacggt36060gctcggcagc gccgccgcgg tgcgctacgc ccgcgaggtc ggtgtcgaggccatcgagcg 36120gcgcaccccg gcgctcgcgg cccggctccg cgaccggctc gcacccatcccgggggtgcg 36180cgtgctcgac cgcggcccgc gtctcgccgc gctcgtcacc ttcgaggtagcgggctggca 36240gccgcagccg ttcaaggcgg ccatggacgc ccgaagcatc aactcggcgctcagcttccg 36300tgagttcgcg caattcgact tcggggacaa ggacgtcgac tggtgcctccgcctgtcgcc36360gcactactac aacaccgagg aggaagtgga ccacgtcgcg gaggcggtcgcggccctcgc36420cggccagggg cggcgatgac cgacacccgg acacgcgcgg aggagccggccggggaacgg 36480cctgaggagc cgcccgggca acggtctgaa gggccgcccg gggaatcgcacgcggagccg 36540tccgcggagc cacccggggg aagcctctgg cacaaccgcg acttcctcaggttctggttc36600ggcgagacgc tgtcgctcct cggtacccag gtcacgaacc tcgccctgccgctgaccgcg 36660atcaacgcct tccacgccac cgacgagcag gtcggtgtcc tgcggttcctgcagctcgtc36720ccgtacctcg gtctcgccct ggtcttcggg gtgtgggtgg accgggcccgtcggcggcgg 36780atcatgctgg gcgccaacct cgtccggatg gtcctgctga ccctcgtacccgtcctgtac36840tggtcggacg cgctcgacat ggtctccctg ctggtgatcg cctgtgccgtcggcgccgcc36900tcggtgctgt tcgacgtgag ctggatgtcg tacgtgccca cgctcgtgcgcgagcccgag 36960cactacgtcg aagccggcgc caagatgggg atgagctcat cggcggccgatgtggcgggg 37020cccgggctcg cgggcgtgct ggtgggcgcc ctgagtgccc cggtggcgctgatcgccgac37080gcgttctcct atctggtgtc cttgatctcg ctgctgctca tccgcacgcccgagccccgc37140cccgaaccgg cggccgcgcg gaggcatctg ccgaccgaga tccgggacggcctgcgctgg 37200gtgctgaaga acccggtcct gcggtcgctg gccgtgatcg gcttctgctgcaacttctcg 37260atgatcaccg tctggacgat gttcctgctg tacggaacgc gcgacctgcgtctggactcg 37320acgaccctcg gcgggatctt cgccaccgct tccgtgggcg gactgatcggcgccgcgatc37380tcccgcaagg tcatccggcg cttcaggctc ggcctcgtct acctcgtcgcccagtccgcc37440ctcctcgtcg gcccgacgct gatcgtcctg gcgaccggtc ccaggtgggtgatggtgggg 37500atgttcgtcc tctccttctt caccacctac ctcgggctcg gcgtcgccgccgtcgtcatc37560gtcagcctgc gccaggtcag taccccgccg tcgatgatgg 
gccggatgacggcggtcttc37620cgcaccctgc tcttcggtgg cggcgccctc ggcggcctgt tcgcgggcctgctgtccggc37680cggatcggcg cccgaggggc attgaccgtg gcggcgaccg gatccgccgccgtactgatc37740gcgctcgccc tgtccccggt gacccggcta cggggcctgc cgccggcaacggaggaaccc37800gtcgcggcgg cgaactgagg tcgcggcgac gtactgaggt cgcggcggcgaactgaggtc37860
gcggcggcga actgaggcgg agaacgtcga agggccccac cgcaagcggtggggcccttc37920gagtcgtgcc cggtgaggca ctggcggagg atacgagatt cgaactcgtgaggggttgcc37980cccaacacgc tttccaagcg tgcgccctag gcctctaggc gaatcctccgccgcaaacaa38040tacaagactc cgaggggtgc tcgcgaacac gtgctctcgg gagggcctcggaaggacccg38100ggaggacccc gggagggggt ggagtgggtc gaggggtggc cgagcacccccggcgatccg38160ctaggctggg ggcaagcccc tcacgtggcg ctatctcacc caactcccccagggccggaa38220ggcagcaagg gtaagtgggc tctggcgggt gcgtgagggg cccttgtgttttccggggga38280tcccgggggc tccgggagcc aggagcgggg cggggagcgg gctccgggatctgtgacgga38340gaccacttgt cggtggggcc cgatatcgtc gtaggtgtgt cgtccctcgcgctctaccgc38400cgctaccgcc ccgagtcctt cgccgaggtc atcgggcagg agcatgtcaccaccccgttg38460cagcaggctc tgcggaacaa ccgggtcaac cacgcgtacc tgttcagcggcccgcgcggc38520tgcggcaaga cgaccagcgc gcgcatcctc gcccgctgtc tgaactgcgagcaggggccg38580acgcccactc cctgcggcga gtgccactcg tgcgtggacc tcgcgcgcaacggtcgtgga38640tcgatcgacg tcatcgagat cgacgccgcg tcccacggtg gtgtcgacgacgcccgtgag38700ctgcgcgaaa aggccttctt cggccccgcc gccagccggt acaagatctacatcatcgac38760gaggcccaca tggtcacctc ggcgggcttc aacgccctgc tgaaggtcgtcgaggagccc38820ccggagcatc tgaagttcat cttcgcgacg accgagcccg agaaggtcatcggcacgatc38880cgttcgcgta cgcaccacta tccgttccgg ctcgtcccgc ccggcaccctccgtgactat38940ctgggcgagg tctgcgagca ggagaagatc cccgtcgagg acggcgtcctgccgctggtc39000gtccgggccg gtgccggttc cgtgcgtgac tcgatgtccg tgatggaccagctgctggcc39060ggcgccgccg aggacggtgt gacatacgcc atggcgacgt ccctcctcggctacacggac39120ggctccctgc tggacgccgt ggtcgacgcc ttcgccgccg gcgacggcgccgcggccttc39180gaggtcgtcg accgcgtcat cgagggcggc aacgaccccc gccgcttcgtcgccgacctg39240ctggagcggc tgcgcgacct ggtgatcctc gccgccgtgc cggacgccgccgagaagggc39300ctcatcgacg ccccggtgga tgtcatcgag cgcatgcagg cccaggcgtccgtcttcggc39360gccggcgagc tcagccgcgc cgccgacctc gtcaacgagg gcctgacggagatgcgcggc39420gccacgtccc cgcgcctcca gctggagctc atctgcgcgc gcgtgctgctgcccgccgcc39480ttcgacgacg agcggtccgt acgggcccgc ctcgagcgtc tggagcgcggcgccgcgagt39540gcggccgccg ccttcacgcc cgcgcccccc ggtacggcca 
tgggctacgtccccggtccg39600gatgcccacg cccacgctcc cgccccggcc gccggtctct ccggcccggcggcggcccgc39660gcggccgtga cgggggcggg gcccgcggca ggtcctgccc ctgttcctgctgcccctgcc39720cctgctgctg ctcctgtcgc tgccgttccc gcgtcgggtc aggccgctcccgctccggcg39780caggctccgg gcgcgcaggc cggtggcgcg tggccggcgg gcgccgcccccgccgccccg39840gcccccgccg cctccgcgcc cgcatcgcag cccggcgcgt ggcctgcggcctccggcgcc39900cccactcctg ccccggccgc cccgcaggcg ggtccccagc ccggcgcctggccgaccgcc39960gcggcgcccg gctcgggccc cgcgcaggct ccggcccccg ccgcgagcgccccgcaggcc40020ggttcctggc ccacgggcgc cgcccccgcc gccccggctc ccgccgcgcccacgggcgcg40080cccatgggcg ccccgcaggg cgacgcaggc caggcgcgcc agctctggccgaacatcctg40140gaggccgtga agaaccgccg ccgcttcacc tggatcctgc tcagccagaacgcccaggtc40200.
cctccgcgcc ctgcagtcgc tcgtcgccca cgtccgcctc gaccagcccc tggacgagct 42600cgacggcctc gacaccgacg tcatcggccc ggaggaccag ctggccgagg aagccggccg 42660cgtcatggcc gccgagatcg gcggcccgat gcggatgcga gcacgaggcg cctcccaggg 42720cgcctgaggc ggcctgccgg gatctgcgcc cctaggggcg cagcccgccg agttgtcgtc 42780tttcgggggc gccgcccgcc ggggtgccgg ccggggtggc cgtgccacgg ggccgccgcc 42840ctgcgggatc gtcgcctccg ggaaggacgt ccctagaggc cgccgccctc cgagttctac 42900gcccggcagt tggcgaccgt gccgcccgcg cggttgtcgg tacggcagat gttgtcgcca 42960
cgcccgccgt cgacggtccc gtattcgccg acgacgatcg cggtgccgcc cggtccctgg 43020atgacgtcat cgtcgtcgtc cccggagacc gtgacgaatg cggcctgccc gatggagggc 43080acctcgatga ggtcggcgtc ggcgccccca cggacgtgga cggggcgtgc cttgtcgtcg 43140acgccgatcg ccccgacgcg gatgacgtcg tccccttcgt cgccgtcgat ggtgacgggg 43200ttggggagac gggacgacag gcgggaggcc ctttccacgg tgatcttgtc cgcgccgggg 43260ccgccacgga gcggtccgcc gagttccccg accgtgatgg tgtcgtcgcc gagtccgccg 43320tcgaggctgc tcgacttgtc ccaggcgggt acgacgtagg cggcgatgcg gagggtgtcg 43380tccccgtcgc cgccctcgat cctggtgctg aacacggcgc cggtggtgat cacgtcgttc 43440ccggcggcgc cgtcgatgag cccccggtac gacacggcgc ccgtggtgat gatgtcgttg 43500cccgcgccgc cgtagatggt gccctggacg ctccacgcgt ccttgtcggt cacggtgatc 43560c43561<210>2<211>334<212>PRT〈213〉生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens ZJU5119)<400>2Met Glu Thr His Thr Phe Gly Thr Phe Gln Asp Ala Tyr Leu Ser Gln151015Leu Arg Asp lie Tyr His Ser Pro Glu Phe Arg Asn Ala Pro Arg Gly202530Gln Ala Ser Arg Glu Arg lie Gly Ala Gly Phe Arg Leu Leu Asp Pro354045Val Gln Arg His lie Ser Val Pro Ala Arg Arg Ala Asn Val Val Phe505560Asn Phe Ala Glu Ala Leu Trp Tyr Leu Ser Gly Ser Asp Arg Leu Asp65707580Phe lie Gln Tyr Tyr Ala Pro Gly lie Ala Ala Tyr Ser Ala Asp Gly859095Arg Thr Leu Arg Gly Thr Ala Tyr Gly Pro Arg lie Phe Arg His Pro100105110Ala Gly Gly Val Asn Gln Trp Glu Asn Val Val Lys Thr Leu Thr Asp115120125
Asp Pro Asp Ser Lys Arg Ala Val Ile Gln Ile Phe Asp Pro Arg Glu
130 135 140
Leu Ala Val Ala Asp Asn Ile Asp Val Ala Cys Thr Leu Ala Leu Gln
145 150 155 160
Phe Leu Ile Arg Asp Gly Leu Leu Cys Gly Ile Gly Tyr Met Arg Ala
165 170 175
Asn Asp Ala Phe Arg Gly Ala Val Ser Asp Val Phe Ser Phe Thr Phe
180 185 190
Leu Gln Glu Phe Thr Ala Arg Tyr Leu Gly Leu Gly Ile Gly Thr Tyr
195 200 205
His His Val Val Gly Ser Val His Ile Tyr Asp Ser Asp Ala Arg Trp
210 215 220
Ala Glu Arg Val Leu Asp Ala Ala Thr Pro Asp Gly Gly Pro Arg Pro
225 230 235 240
Gly Phe Pro Ala Met Pro Asp Gly Asp Asn Trp Pro His Val Arg Arg
245 250 255
Val Leu Glu Trp Glu Glu Arg Leu Arg Thr Asn Ala Ala Arg Leu Ser
260 265 270
Ala Asp Ala Leu Asp Ala Leu Asp Leu Pro Ala Tyr Trp Lys His Val
275 280 285
Val Ala Leu Phe Glu Ala His Arg Gln Val Arg His Glu Asp Thr Pro
290 295 300
Asp Arg Ala Leu Leu Ala Ala Leu Pro Glu Val Tyr Arg Gln Ser Leu
305 310 315 320
Ala Val Lys Trp Pro Gly His Phe Gly Ser Pro Ala Gly Ser
325 330
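Residue rows in three-letter notation, such as those listed above, are often easier to compare or search once converted to one-letter codes. The following is a minimal sketch of that conversion, not part of the original listing; the function name `to_one_letter` is hypothetical, and the sketch assumes the rows contain only the 20 standard three-letter codes, with or without spaces between residues.

```python
# Standard three-letter to one-letter amino acid codes (20 common residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(row: str) -> str:
    """Convert one row of three-letter residue codes to one-letter codes.

    Whitespace is ignored, so both "Val Arg Ala Arg" and "ValArgAlaArg"
    are accepted. Position-number rows must not be passed in.
    """
    compact = "".join(row.split())
    if len(compact) % 3 != 0:
        raise ValueError("row length is not a multiple of three characters")
    codes = [compact[i:i + 3] for i in range(0, len(compact), 3)]
    unknown = [code for code in codes if code not in THREE_TO_ONE]
    if unknown:
        raise ValueError(f"unrecognized residue codes: {unknown}")
    return "".join(THREE_TO_ONE[code] for code in codes)


if __name__ == "__main__":
    # Last residue row of SEQ ID NO: 2 (positions 321 to 334).
    print(to_one_letter("Ala Val Lys Trp Pro Gly His Phe Gly Ser Pro Ala Gly Ser"))
```

Rows containing a code outside the table raise an error instead of being silently skipped, which makes transcription problems in a row visible.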
<210>3
<211>170
<212>PRT
<213> Streptoverticillium rimofaciens ZJU5119
<400>3
Val Thr Thr Thr Pro Lys Pro Arg Thr Ala Pro Ala Val Gly Ser Val
1 5 10 15
Phe Leu Gly Gly Pro Phe Arg Gln Leu Val Asp Pro Arg Thr Gly Val
20 25 30
Met Ser Ser Gly Asp Gln Asn Val Phe Ser Arg Leu Ile Glu His Phe
35 40 45
Glu Ser Arg Gly Thr Thr Val Tyr Asn Ala His Arg Arg Glu Ala Trp
50 55 60
Gly Ala Glu Phe Leu Ser Pro Ala Glu Ala Thr Arg Leu Asp His Asp65707580Glu lie Lys Ala Ala Asp Val Phe Val Ala Phe Pro Gly Val Pro Ala859095Ser Pro Gly Thr His Val Glu lie Gly Trp Ala Ser Gly Met Gly Lys100105110Pro Met Val Leu Leu Leu Glu Arg Asp Glu Asp Tyr Ala Phe Leu Val115120125Thr Gly Leu Glu Ser Gln Ala Asn Val Glu lie Leu Arg Phe Ser Gly130135140Thr Glu Glu lie Val Glu Arg Leu Asp Gly Ala Val Ala Arg Val Leu145150155160Gly Arg Ala Gly Glu Pro Thr Val lie Gly165170<210>4<211>420<212>PRT<213> 生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens ZJU5119)<400>4Val Ala Gly Ala Glu Phe Gly Trp Gly Ser Ala Gly Lys Leu Ala Ala151015lie Val Ala Ala Leu Arg Glu Arg His Gly Glu Arg Val Arg Phe Ala202530Gly Leu Gly Ser Gly Leu Gly Arg Pro Val Leu Gly Ala Leu Asp Ala354045Arg Asp Trp Thr Asp Val Pro Glu Pro Gly Asp Gly Pro Ala Gly Glu505560Ala Ala Leu Ala Ala Leu Leu Arg Glu Arg Gly Val Asp Ala Ala Val65707580Val Val Leu Asp Gly Leu Leu Ala Ala Arg Leu Glu Ala Val Gly Cys859095Pro Val Val Tyr Val Asp Ser Leu Pro Phe Leu Trp Thr Glu His Asp100105110Phe Val Pro Ser Gly Val His Thr Tyr Cys Ala Gln Leu Cys Pro Ser115120125Leu Pro Arg Gln Ser Trp Pro Val Leu Arg Gly lie Glu Ala Leu Arg130135140Trp Val Glu Pro Val Val Gly Thr Tyr Gly Ala Gly Gly Leu Asp Pro145150155160
Val Pro Gly Lys Ala Val Leu Asn Val Gly Gly Leu Arg Ser Pro Phe
165 170 175
Thr Ala Glu Asp Asp Asp Ser Tyr Val Glu Leu Val Leu Gly Pro Ala
180 185 190
Leu Arg Ala Leu Arg Ala Ala Gly Phe Gly Gln Val Val Ile Ser Gly
195 200 205
Asn Val Asp Pro Gly Leu Ala Arg Val Pro His Ala Gly Thr His Gly
210 215 220
Leu Thr Val Thr Ala Gly Arg Leu Asp His Gly Ala Phe Ile Glu Glu
225 230 235 240
Leu Arg Thr Ala Glu Leu Leu Val Thr Ser Pro Gly Arg Thr Thr Leu
245 250 255
Leu Glu Ala Ala Ala Leu Gly Gln Arg Ala Val Val Leu Pro Pro Gln
260 265 270
Asn Phe Ser Gln Val Met Asn Ala Ala Asp Val Ala Asp Leu Val Asp
275 280 285
Pro Ala Val Val Val Pro Trp Pro Ala Ala Val Leu Asp Leu Ala Glu
290 295 300
Leu Ala Arg Val Arg Asp Gln Gly Glu Glu Gly Ala Val Arg Leu Met
305 310 315 320
Tyr Ala Arg Ile Ala Ala Ala Arg Arg Glu Pro Gly Thr Val Ala Gly
325 330 335
Pro Leu Ala Asp Ala Leu Gly Ala Ala Val Ala His Val Arg Arg His
340 345 350
Asp Val Arg Met Gly Pro Phe Ala Gly Thr Asp Gly Ser Gly Ala Gly
355 360 365
Thr Arg Gly Ala Gly Gly Ala Arg Asp Thr Gly Gly Ala Gly Gly Ala
370 375 380
Arg Ser Val Ala Asp Ala Val Asp Glu Leu Ile Gly Lys Leu Thr Asp
385 390 395 400
Gly Pro Ala Ala Gly Asn Arg Arg Asp Gly Ser Pro Leu Ala Ala Pro
405 410 415
Val Arg Ala Arg
420
<210>5
<211>395
<212>PRT
<213> Streptoverticillium rimofaciens ZJU5119
<400>5[1000]Met Arg His Pro Arg Glu Leu Arg Gln Asp Thr Ser Leu Ala lie Asn151015Gly Gly Thr Pro Thr Phe Ala Ala Leu Pro Glu Glu Asp Thr Gly lie202530Val Ala Glu Ala Ala Asp Glu Val Ala Glu Leu lie Arg Thr Arg Arg354045Thr Val His Trp Gly Gly Gly Pro His Thr Arg Val Leu Glu Arg Asp505560Phe Ala Ala Leu Val Gly Arg Glu Arg Ala Phe Phe His Asn Ser Gly65707580Thr Ala Ala Leu Gln Thr Ala Leu Phe Ala Leu Glu Val Glu Glu Gly859095Thr Pro Val Ala Leu Ser Asp Ser Gly Phe Val Ala Ser Leu Asn Ala100105110Leu Tyr His Leu Arg Ala Arg Pro Val Phe Leu Pro Thr His Pro Ala115120125Thr Leu Gln Cys Val Asp Asp Val Ala Glu Trp Thr Ala Gly Thr Gly130135140Val His Thr Ala Leu lie Thr His Phe Phe Gly Asn Val Ala Asp Val145150155160Glu Ala lie Trp Arg Thr Ser Gly Ala Arg His Leu Val Glu Asp Gly165170175Gly Gln Ala His Gly Ala Arg Leu Arg Gly Arg Pro Val Gly Ser Phe180185190Gly Thr Val Gly Ser Phe Ala Gly Ser Thr Lys Lys Leu Val Thr Ala195200205Gly Gln Gly Gly Leu Asn Val His Asp Asp Glu His Leu Asp Trp Arg210215220Met Arg Thr Tyr Ala His His Gly Lys Ser Gly Asn Tyr Glu Gly Thr225230235240Phe Pro Gly Tyr Asn Phe Arg Gly Gly Glu Met Glu Ala lie Leu Ala245250255His Ala Ala Leu Arg Arg Leu Asp Glu Arg Val Ala Ala Arg Asn Arg260265270Thr Ala Asp Thr Met Phe Arg lie Phe Asp Glu Ala Gly lie Arg Thr275280285Ala Arg Pro Ala Pro Gly Leu Asp Cys Ser Pro Ala Trp Phe Asp Val290295300Ala Leu lie Leu Asp Glu Glu Trp Leu Gly His Arg Asp Trp Leu Val.1039]3053103153201040]GluAlaMetValAlaAspGlylieProGlyTrpHisTyrProAlaLeu1041]3253303351042]lieGlyMetProTrpValGluProTrpMetArgSerLysGlyTrpTrp1043]3403453501044]GlyGluArgGluGinGluLeuLeuAlaSerGluThrAlaLeuTrpGly1045]3553603651046]ArgThrLeuValLeuGlyAlaGinMetAsnAlaValAspAlaGluArg1047]3703753801048]lieAlaHisAlaValValAlaLeuLeuLysGly1049]3853903951050]<210>61051]<211>2731052]<212>PRT1053]<213> 生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens 
ZJU511054]<400>61055]MetThrCysGlyGlulieSerGluValArgArgValLeuArgArgLeu1056]1510151057]GlyAspGlyGlyProArgSerValArgValArgGluAsnGlyAsnCys1058]2025301059]AlaValTyrValGlyAspArgLeuValValArgValGlyHisSerTrp1060]3540451061]ProLeuAspAlaArgGlyGluLeuHisCysTrpSerValAlaArgAsp1062]5055601063]AlaGlyValProAlaProGluArglieAspGluGlyArgLeuProGly1064]657075801065]GlyArgThrTyrValAlaTyrValTyrValMetGlyThrProAlaGly1066]8590951067]ThrProAlaSerLeuAlaAlaAlaGlyAlaValLeuAlaArgLeuHis1068]1001051101069]ThrValProGlyGluHisPheProAlaValAlaHisAsnLeuProArg1070]1151201251071]ArgArgAspArgTyrArgThrAlaValArgCysAlaArgAlaAlaGly1072]1301351401073]LeuAlaProGlyGlyLeuAlaHisArgCysLeuLeuArgAlaAlaAsp1074]1451501551601075]AspTrpArgArgSerArgGluValAlaAlaHisGlyAspPheArgThr1076]1651701751077]ProAsnLeuValValArgGlyArgGlyValArgAlaValLeuAspTrp[1078]180185190Ser Asp Ala Arg Ala Ala Ser Pro Glu Ser Asp Leu Gly Gln Leu Gly195200205Pro Gly Gln Leu Arg Pro Leu Leu Arg Gly Tyr Leu Asp Arg Ala Arg210215220Arg Ala Pro Asp Leu Glu Leu Val Ala Gly His Met Leu Ala Arg His225230235240 [1085]Leu Ala Leu Glu Ala Ala Gly Val Phe Pro Ala Gly Thr Ser Ala Ala245250255Leu Ala Arg Arg Phe Gly Pro Gly Leu Ser Arg Gly Arg Trp Thr Val260265270Ala<210>7<211>157<212>PRT<213> 生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens ZJU5119)<400>7Leu Pro Asp Arg Ser Pro Ala Ala Glu Pro Leu lie Leu Asp Val Gly151015Ser Ala Gly Gln Leu Ala Glu Leu Ala Gly Asp Leu Val Asp Leu Ala202530Gly Pro Gly Gly Ala Thr Gly Pro Trp Val Leu Thr Trp Ala His Gly354045Ala Gly Glu Pro Gly Gly Glu Pro Gly Glu Gly Gln Asn Arg Gly Pro505560Asn Gly Gly Thr Gly Gly Gly Pro Gly Gly Thr Val Ala Arg Pro Pro65707580Gly Ala Thr Val Val Arg His Gly Gly Leu Glu Val Val Thr Val Pro859095Arg Pro Pro Arg Asp Leu Gly Gly Phe Leu Asp Ala Cys Cys Arg Thr100105110Gly Pro Val Ser Gly His Pro Asp Val Thr Arg Thr lie Leu lie Leu115120125Ala Asp Pro Thr Asp Arg Asp Arg Ser Ala Ser Pro Pro Glu Ala Pro130135140His Asp Ala Pro Arg Asp Gly Ala Arg Asp Gly Arg Pro145150155<210>8<211>3351117]<212>PRT1118]<213> 
生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens ZJU511119]<400>81120]MetThrHisProAlaThrGlyProAlaThrGlyGlyArgAspArgTyr1121]1510151122]LeuPhelieArglieLeuGluAlaCysAsnAlaAspCysPheMetCys1123]2025301124]GluPheAlaLeuSerArgAspThrTyrArgPheThrLeuAspAspPhe1125]3540451126]ArgGluLeuLeuProGinAlaGinGluSerGlyValArgTyrValArg1127]5055601128]PheThrGlyGlyGluProLeuMetHisGlyGluValLeuAspLeulie1129]657075801130]ArgGluGlyThrAlaAlaGlyMetArgMetSerLeulieThrAsnGly1131]8590951132]PheArgLeuProGinMetValAspLysLeuAlaGluAlaGlyLeuAla1133]1001051101134]GinVallieValSerLeuAspGlySerSerGlyGluThrHisAspVal1135]1151201251136]TyrArgArgThrProGlyMetPheAspArgGlyLeuAspGlyLeuVal1137]1301351401138]ArgAlaSerArgAlaGlyMetLeuThrArgValAsnThrValValGly1139]1451501551601140]ProHisAsnPheAlaGinMetProGluLeuGinArgValLeuThrGlu1141]1651701751142]AlaArgValGluGinTrpGluMetSerAlaLeuLysLeuGluArgHis1143]1801851901144]lieAlaTyrProProAlaGluGluValLeuHisAlaCysGluProVal1145]1952002051146]PheLeuAlaAspProLysArgTrpLeuValProLeuGlyLysArgPhe1147]2102152201148]TyrGlyGluThrAlaGluGluArgGluAlaPhePheGluArgGlyThr1149]2252302352401150]ThrProSerAlaSerArgProLeuCysHisValThrAspAspValMet1151]2452502551152]TyrLeuAspProLysLeuGlyArgThrPheAlaCysSerCysLeuPro1153]2602652701154]HisArgAspGlyProGlyAlaAspMetArgAspGluArgGlyArgVal1155]275280285[1156]Phe Leu Asn Ser Pro Ser Phe Arg Ala His Ala Glu Glu Phe Lys Gln290295300Gln Gly Pro Val lie Cys Ser Gly Cys Ser Thr Thr Ala Ala Gly Tyr305310315320Ser Asp Asp Val Ala Arg Leu Gly Ser Val Pro Ala Trp His Tyr325330335<210>9<211>740<212>PRT<213> 生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens ZJU5119)<400>9Met lie Leu Arg Thr Asp His Val Asp Ala Tyr Leu Ser Ala Val Ser151015Ala lie Leu Asp Glu Pro Gly Arg Ala Gly Ala Gly Val Pro Val Leu202530Cys Arg Pro Gly Ser Pro Leu Asp Val Leu Val Thr Arg Trp Ser Ala354045Leu Leu Gly His Ala Gly Pro Arg Ala Arg Ser Asp Arg Pro Gly Arg505560Ala Val Val Ala Val Gly Asp Asp Pro Val Val Ser Ala Ala Ala Arg65707580Leu Leu Ala Val Leu Thr Gly Arg Thr Ala Leu Ala Val Ala Asp Val859095Lys Glu Leu Pro 
Ala Leu Trp Glu Arg His Asp Leu Val Ser Thr Ala100105110Leu Val Gly lie Gly Thr Gly Phe Asp Val Pro Gly Val Glu Pro Ser115120125Ala Phe Trp Arg Leu Asp Ala Thr Asp Ala Thr Leu Gly lie Leu Thr130135140Gly Arg Asp Arg Glu Ser Leu Thr Trp Phe Val Ala Lys Ser Leu Leu145150155160Thr Ser Thr Val Pro Gly Asp Ala Gln Thr Leu Leu Leu Pro Asp Arg165170175Lys Pro Arg Glu Asp Thr Ala Ser Ala Gly Val Gly Ala Gly Gly Val180185190Glu Val Leu Tyr Gly Ala Ala Ala Glu Glu Ala Leu Pro Ala Leu Ala195200205Glu Asp Glu Arg Val Arg Ala Leu lie Ala Val Glu Ala His Gly Arg210215220.[1195AlaAspHisLeuGlyValArgAspGlylielieCysGlyAspArgLeu[1196225230235240[1197AlaHisLeuGlyArgSerSerGluProGluGlylieGlyArgValPro[1198245250255[1199GinCysAlaPheGlyHisGlyCysPheLysProGlyAlaArgValAla[1200260265270[1201lieSerArgMetProAlaGinSerLeuPheLeuHisSerCysThrSer[1202275280285[1203SerHisThrGluAlaAspMetTyrGluLysSerPheLeuLeuGlyLeu[1204290295300[1205AlaAlaLeuGluGlyProAlaArgHisValLeuGlyThrValArgPro[1206305310315320[1207MetHisAspGlyGlyHisGluValGlyLeuValSerAlaLeuThrAla[1208325330335[1209AlaGlyAlaSerAlaGlyGluValThrArgLeuLeuAsnAlaSerTyr[1210340345350[1211HisGinHisArgGlyGluProAlaProTyrLeuLeuLeuGlyAspPro[1212355360365[1213GluLeuProPheAlaAspGlyProValGlyGlyProAspAlaGlyPro[1214370375380[1215AlaValGluLeuAspAlaSerAlaGlyAlaLeuProLeuGlyGlyArg[1216385390395400[1217ArgThrAlaValLeuGlySerGlyProGlyValLeuValValGlyAsp[1218405410415[1219AlaThrGlyAspGluAspGlyAspGlyProGlyLeuProAlaGlyVal[1220420425430[1221GlyAlaLeuThrValArgArgGlyAspArgThrAspValValAlaTrp[1222435440445[1223SerThrGluGlyProLeuProGluGlyAlaLeuProLeuValArgArg[1224450455460[1225GluGlyGlyAlaValAlaAlaAspGlyGlyAlaGluGluLeuHisAla[1226465470475480[1227ArgTrpAspHisValAspHisGlylieAlaSerGlyGlyAlaLeuGly[1228485490495[1229LeuLeuProLysAspLeuThrGlyArgLeuGinGluLeuArgAspLeu[1230500505510[1231AlaAlaAlaValGlyThrAlaAspArgAspAlaArgPhePheProGly[1232515520525[1233ArgLeuGlyAlaValArgArgAlaAlaAlaArgLeuAspGinArglie[1234]530535540Arg Asp Ala Asp Arg Ala Leu Met His Ala Leu Leu Gly Arg Asn Gly545550555560Lys Pro 
Phe Asp Ala Asp Asp Arg Leu Glu Ser Ala Phe Val Pro Leu565570575Glu Ser Gln Tyr Gly Arg Gln Val Cys Trp Cys Gly Arg Asp Ala Val580585590Val Ser Arg Leu Arg Pro Arg Leu Gly Ala Arg Glu Val Arg Arg Lys595600605Tyr Asn Cys Met Gln Cys Gly Asp Tyr Ala Gln Val Ala Val Asp Gly610615620Val Asp Val Arg Trp Glu Ala Pro Glu Phe Val Ala Ser Gly Gly Glu625630635640Leu Glu His Ser Phe Arg lie Ala Asn Pro Leu Pro His Pro Val Thr645650655Gly Val Leu Ala Leu Ser Val Ser Pro Trp Tyr Gly Gly Asp Val Ser660665670Phe Arg Pro Gly lie Ala Thr Phe Ser Val Ala Pro Gly Gly Thr Cys675680685Arg Val Gly Val Thr Met Arg Ala Ala Gly Leu Lys Pro His Arg Tyr690695700Thr Val Asp Ala Thr Val Val Ser His Leu Arg lie Asn Ala Tyr Arg705710715720Lys Phe Val Gln Val Arg Pro Ala Gly Pro Val Gly Pro Ser Asp Glu725730735Asp Gly Ala Leu740<210>10<211>360<212>PRT<213> 生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens ZJU5119)<400>10Val Thr Ala Pro Thr Thr Gly Pro Thr Thr Gly Pro Thr Thr Gly Pro151015Thr Ala Gly Pro Thr Thr Gly Pro Thr Ala Gly Pro Thr Ala Gly Pro202530Thr Ala Gly Ser Ala Val Ala Glu Glu Ala Val Ala Glu Trp Ala Ala354045 [1272]Ala Trp Leu Glu Gln Val His Gly Val Arg Tyr Gly Pro Asp Asp Ala[1273]505560Leu Phe Gly Ser Leu Asp Ser Leu Ala Leu Thr Glu Leu Leu Val Ala65707580Cys Glu Ala His Phe Gly Leu Arg lie Asp Glu Gly Phe Gly Trp Gln859095Ala Leu Ala Ser Val Arg Ser Leu Ala Ala His Val Ala Thr Gly Val100105110Arg Pro Pro Ser Asp Arg Val Trp Phe Arg Ser Gly Pro Gly Ala Thr115120125Gly Asp Ala Asp Leu Asp Arg Thr Ala Val Val Arg Val Ala Leu Gly130135140Leu Pro Pro Gly Ala Ala Val Ala Arg Leu Ser Pro Arg Glu Leu Ala145150155160Leu Gly lie Gly Ala Ala Ala Ala Pro Ser Glu Lys Pro Ala Thr Thr165170175Leu Pro Pro Glu Arg Glu Arg Leu Ser Leu Ala Arg Glu Ser Ser Thr180185190Arg Pro Gly Ser Leu Leu Ala Val Gly Ala Thr Ala Ala Arg lie Arg195200205Ala Phe Ala Gly Arg Leu Asp Ala Ala Leu Ala Ala Val Gly Ala Thr210215220Pro Val Trp Tyr Pro lie Thr Thr Asp Ser Pro Val Gly Ala Asp His225230235240Val Gln Gly lie 
Pro Ser Glu Leu Thr Ala Gly Arg Leu Gly His Ala245250255Gly Cys Leu Gln Leu Leu Ala Glu Leu Pro Ala Glu Arg Asp Val Val260265270Tyr Ser Gly lie Ala Tyr Ala Phe Arg Asp Glu Pro Gly Arg Arg Trp275280285Glu Pro Ala Gly Arg Leu Glu Ala Tyr Arg Val His Glu Thr Val Val290295300His Gly Thr Glu Glu Phe Arg Thr Ala Met Trp Arg Arg Leu Tyr Glu305310315320Leu Val Asp Arg Glu Leu Ser Ala Leu Gly Pro Gly Gly Trp Gln Glu325330335Gly Arg Asp Gly Phe Thr Pro Arg His Gly Pro Gln Ala Arg Met Ala340345350Ala Gly Thr Arg Arg Trp Ala Arg[1311]355360[1312]<210>11<211>317<212>PRT<213> 生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens ZJU5119)<400>11Met Asp Ala Ala Pro Gly Thr Ala Arg Thr Ala Ala Gly Thr Ser Val151015Pro Pro Val Leu Pro Val Asp Ala Glu Arg Pro Ala Ala Arg Arg Thr202530Leu Ala Met Glu Glu Gly Thr Pro Arg Gln Trp Glu Gly Leu Gly Leu354045His Gly Val Pro Glu Ala Val Glu Ala Ala Leu Gly Pro Ala Ala Glu505560Leu Val Val Ala Ala Arg Gly Gly Gly Arg Ser Pro Leu Pro Gly Leu65707580Val Phe Ala Gln Pro Cys Leu Gly Arg Ser Ala Gly Val Ala Arg Asp859095Leu Pro Val Ser Val Val Trp Glu Thr Gly Val Ala Leu Ala lie Ala100105110Arg Ala Leu Asp Arg Pro Ala Val lie Gly Leu Cys Val Tyr Glu Glu115120125lie Leu Gln Gln Pro His Arg Asp Ala Glu Phe Thr Ala Leu Gly Ala130135140Ala Val Ala Arg Thr Val Glu Ala Leu Gly Arg Leu Leu Gly Val Ala145150155160Val Thr Ala Arg Val Glu Thr Ala Ala Pro Arg Ala Ala Glu Val Pro165170175Ala Arg Arg Leu Tyr Gly Leu Tyr Thr Pro Phe Ser Glu Ser Thr Tyr180185190Pro Arg Gly Phe Pro Asn Glu Ala Glu Val Leu Arg Ala Phe Ser Ala195200205Tyr Cys Gly Arg Tyr Glu Asp Ala Ala Arg Arg Glu Ala Ser Leu Trp210215220Val Thr Glu Gly Val His Leu Ala Lys Ala Ala Leu Leu Gly Leu Gly225230235240Pro Gly Val Pro Phe Leu Ala Thr Thr Pro Leu Pro Asp Pro Ala His245250255Pro Gly Arg Leu Leu Gln Asp Ala Pro Ala Ala Thr Arg Val Thr Leu260265270[1351]Glu Arg Arg Ser Ala Leu Pro Ala Asp Trp Trp Pro Glu Gln Ala Leu275280285Glu Arg Ala Leu Gly Thr Gly Leu Arg Arg Leu Thr Glu Asp Phe His290295300Ala 
Leu lie Glu Asp Phe His Asp Pro Ala Gly Asp Arg305310315<210>12<211>442<212>PRT<213> 生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens ZJU5119)<400>12Met Arg Thr Pro Arg Thr Gly Ala Val Leu Gly Gly Arg Gly Pro Ala151015Leu Pro Phe Val Ala Tyr Met Ala Leu Ser Asn Ala Gln Phe Thr Arg202530Gly Val Phe Val Leu Phe Leu Leu Arg Gly Asn lie Ser Leu Ala Glu354045Val Gly Leu Leu Glu Ser Leu Phe His Leu Thr Arg Val Leu Cys Glu505560Val Pro Ala Gly Ser Val Ala Asp Arg Trp Gly Arg Arg Arg Thr lie65707580Gln Ala Gly Leu lie Leu Ser Ala Ala Ala Met Pro Ala Phe Leu Leu859095Gly Gly Met Phe Trp Tyr Ala Leu Ala Phe Val Phe Gln Gly Ala Gly100105110Trp Ala Ala Gln Arg Gly Ala Asp Thr Ala Leu Leu Tyr Glu Leu Leu115120125Glu Arg Thr Gly Gly Thr Asp Arg Tyr Ala Arg lie Leu Gly Arg Ser130135140His Ala Ala Ser Tyr Gly Thr Leu Ala Leu Thr Thr Ala Leu Gly Ala145150155160Met Leu Tyr Gln Arg His Val Ser Leu Pro Phe Trp Leu Gln Ala Ala165170175Val Thr Leu Leu Ala Val Gly Ala lie Gly Val Leu Pro Glu Ser Ser180185190Gly Thr Ala Ala Ser Gly Ala Gly Ser Ser Gly Ser Gly Ser Ser Gly195200205Glu Pro Ala Glu Arg Pro Met Gly Val Trp Arg Leu Ala Arg Ala Gly210215220[1390]Ala Arg Leu Val Val Gly His Pro Val Leu Arg Leu Phe Val Ala Phe225230235240Val Ala Leu Val Glu Ala Gly Thr Thr Val Val Ser lie Phe Ser Gln245250255Ser Phe Phe Arg Thr Leu Gly Tyr Gly Thr Ala Thr Thr Gly Leu lie260265270Leu Ala Leu Val Thr Ala Phe Ser Ala Ala Ala Ala Leu Gln Ser His275280285 [1398]Arg Leu Val Glu Arg Gly Pro Val Arg Val Leu Met Ala Ala Ser Ser290295300Leu Tyr Leu Val Gly Leu Ala Gly Met Ala Ser Leu Gln Pro Gln Leu305310315320Ala Val Val Gly Tyr Tyr Leu Val Phe Leu Asn Leu Asp Leu Leu Ala325330335Pro Val Leu Ser Ala Phe Phe His Arg Ser Val Asp Glu Glu Val Arg340345350Ala Thr Ala Gly Ser Tyr Leu Asn Leu Ser Thr Ser Val Leu Thr Phe355360365Ala Ala Phe Pro Leu Ser Gly Ser Leu lie Asp Ala Gly Gly Tyr Arg370375380Pro Leu Leu lie lie Thr Ala Leu Val Ser Leu Pro Leu Leu Val Phe385390395400Leu Val Gly Ala Ala Arg Arg Val Leu Ser Pro 
Pro Glu Glu Gly Asp405410415Ser Gly Glu Asp Ala Gly Glu Arg Ala Gly Pro Lys Gly Pro Gly Ala420425430Ala Ala Pro Asp Thr Thr Thr Thr Gly Val435440<210>13<211>328<212>PRT<213> 生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens ZJU5119)<400>13Met Thr Thr Arg Ala Asp Ser Pro Ser Pro Gly Ser Gly Gly Pro Val151015Gly Pro Gly Gly Ser Gly Gly Asp Asp Gly Arg Pro Val lie Ala Leu202530Arg Phe Ala Pro Ala Asp Val Glu Ala Ala Ala Ala Ala Glu Tyr Val354045[1429]Ala Ala His Leu Gly Gly Phe Arg Cys Leu Pro Glu Cys Pro Gln Glu505560Gly Asp Ser Gly Pro Gly Arg Asn Pro Pro Ala Ala Val lie Val Phe65707580Gly Arg Ser Gly Ala Ala Gly Gly Ala Gly Pro Ala Gly Val Pro Thr859095 [1435]Val Leu Val Glu Gly Ala Glu Pro Val Pro Gly Thr Asp Ala Asp Val100105110Val Cys Arg Gln Ala Pro Gly Trp Leu Thr Ala Gly Glu Pro Pro Ala115120125Pro Pro Ala Val Arg Pro Gly Gly Gly Arg lie Arg Thr Val Asp Val130135140Ala Ala Val Ala Pro Phe Arg Gln Val Arg Ser Gly Gly Gly Gly Gly145150155160Arg Ala Ala Leu Leu Leu Gly Gly Ala Gly Gly Pro Asp Gly Ser Gly165170175Ala Ser Ala Gly Gly Glu Ala Leu Pro Gly Ala Leu Ala Arg Phe lie180185190Ala Gly His Pro Ala Ala Ala Gly Asp Ala Trp Ala Val Leu Thr Asp195200205Leu Thr Gly Glu Pro Leu Arg Glu Leu Leu Gly Leu Leu Pro Pro Thr210215220Ala Arg Thr Val Gly Ala Ala Asp Trp Ala Gln Val Leu Arg Arg Ala225230235240Asp Ser Leu Val Ala Thr Pro Thr Leu Leu Ala Ala Ala His Ala Arg245250255Thr Ala Arg lie Pro Leu His Val Leu Asp Pro Ala Gly Pro Ala Gln260265270Arg Arg Val His Arg Ala Leu Ala Ala lie Ala Gly Ala Pro Gly Glu275280285Pro Gly Gly Leu Pro Val Val Gly Pro Asp Asp Trp Pro Arg Asp Asp290295300Gly Arg Ala Gly Ala Leu Gly Gly Ala Ala Gln lie Ala Arg Gln Val305310315320Arg Gln Leu Cys Leu Ala Pro Ala325<210>14<211>389<212>PRT[1468]<213> 生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens ZJU5119)<400>14Met Ser Asp Thr Leu Ala His Asn Arg Pro Leu Asp Leu Thr Gln His151015 [1472]Glu lie Ala Ala Leu Arg Ser Glu His Asn Leu Ala Asp Ala His Thr202530His Gln Tyr Gln Ser Pro Ala Gln Gln Leu lie 
Val Asp Ser Leu Pro354045Ala Leu Trp His Glu Ala Glu Lys Gly Arg Gln Ala Asp Phe Glu Gln505560Arg Phe lie Glu Ala Phe Phe Arg Leu His Gly Gln Pro Thr Ala lie65707580Gly Leu Asp Arg Thr Leu Leu Thr Tyr Ala Ala Ser lie Ser Thr Met859095lie Ala Gly Met Phe Leu Lys Arg Arg Asp Ala Arg Val Thr Leu Val100105110Glu Pro Cys Phe Asp Asn Leu Pro Asp Leu Leu Val Asn Leu Gly Val115120125Pro Leu Thr Ala Leu Pro Glu Asp Ala Leu Arg Asp Pro Ala Arg lie130135140His Arg Glu Leu Ser Arg Leu Val Thr Thr Glu Ala Leu Phe Leu Val145150155160Asp Pro Asn Asn Pro Thr Gly His Ser Leu Phe Ala Asp Gly Met Arg165170175Gly Phe Glu Glu Val Val Arg Phe Cys Arg Glu Arg Gly Thr Val Leu180185190Val Leu Asp Leu Cys Phe Ala Ala Phe Ala Leu Gly Ser Gly Gly Pro195200205Gly Arg His Asp Val Tyr Glu Leu Leu Glu Asn Ser Gly Val Thr Tyr210215220lie Ala Met Glu Asp Thr Gly Lys Thr Trp Pro Val Gln Asp Ala Lys225230235240Cys Ala Leu Leu Thr Thr Ser Ala Asp lie Tyr Pro Ala Val Tyr Asn245250255Leu His Thr Ser Val Leu Leu Asn Val Ser Pro Phe lie Leu Asn Thr260265270Leu Thr Arg Tyr lie Glu Asp Ser Arg Arg Asp Gly Phe Ala Ser Val275280285Thr Asp Val Leu Glu Arg Asn Arg Lys Ser Leu Arg Ala Ala Thr Glu[1507]290295300Gly Thr Val Leu Arg Ala His Glu Pro Asp Val Pro Val Ser Val Ala305310315320Trp Phe Thr lie Asp Asp Arg Gly Pro Asp Ala Thr Gln Leu Gln Arg325330335Asp Leu Ser Gly His Gly lie His Val Leu Pro Gly Thr Tyr Phe Tyr340345350Trp Asn Glu Pro Ser Arg Gly Glu Arg Tyr Val Arg Val Ala Leu Ala355360365Arg Asp Pro Gly Glu Phe Asp Ala Ser Met Ala Arg Leu Arg Thr Leu370375380Leu Ala Arg Tyr Ala385<210>15<211>258<212>PRT<213> 生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens ZJU5119)<400>15Met Arg Glu Pro Gly Leu lie Ala Pro Leu Val Thr Pro Leu Thr Pro151015Asp Gly Ala Val Ser Glu Ala Cys Val Arg Ala Gln Val Ala Arg Val202530Arg Pro Tyr Val Arg Ala Leu Met Pro Gly lie Ser Cys Gly Glu Gly354045Trp Leu Leu Asp Arg Pro Arg Trp Glu Arg Leu Ala Ala Ala Val Leu505560Asp Cys Arg Asp Gly Leu Pro Val His Leu Gly Val Gln Ala Ala 
Asp65707580Thr Ala Glu Val lie Arg Arg Ala Arg Trp Ala Val Arg His Gly Ala859095Asp Ala Val Thr Val Gly Pro Pro His Gly Ala Gly Ala Arg Gln Arg100105110Ala Val His Glu His Phe Ala Arg Val Cys Ala Ala Val Asp Thr Pro115120125Val Cys Val Tyr His Glu Ser Val Val Ser Gly Thr Arg Met Thr Pro130135140Ala Thr Leu Thr Ala Val Cys Arg Leu Asp Gly Val Arg Ala Val Lys145150155160Glu Ser Gly Arg Glu Pro Ser Val Thr Asn Asp Leu lie Ala Ala Val.[1546]165170175Pro Asp Val Ala Val His Gln Gly Trp Glu Asp Leu Phe His Ala Thr180185190Pro Gly Ala Ala Gly Leu lie Gly Pro Leu Val Leu lie Asp Pro Ala195200205Leu Cys Ala Glu Leu Val Ala Gly Val Gly Gly Val Gln Gly Val Val210215220Thr Asp Arg Cys Arg Glu Leu Gly Leu Phe Arg Pro Asp Tyr Val Ala225230235240Arg Thr Lys Arg Glu Leu Cys Arg Leu Gly Val Leu Ala His Ala Val245250255Thr Leu<210>16<211>356<212>PRT<213> 生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens ZJU5119)<400>16Val Cys Pro Leu Thr Glu Ala Arg Leu Thr Gly Arg Ala Arg Ala Val151015Tyr Leu Glu lie Leu Arg Ala Gly Gly Ser Leu Pro Arg Thr Ala Leu202530Arg Thr Pro Gly Pro Pro Glu His Ser Gly Glu Gly Thr Asp Ala Asp354045Gly Asp His Glu Leu Thr Glu Ala lie Asp Ala Leu Val Ala Leu Arg505560Leu lie Gln His Thr Asp Arg Gly Arg Leu Leu Ala Ala lie Ser Pro65707580Gln Ser Ala Ala Ala Ala Leu Ser Ala Val Arg Glu Gly Glu lie Gln859095Arg Gln Arg Leu Glu Asp Glu Arg Leu Arg Ser Ala Met Ala Ser Leu100105110Gln Asp Ala Tyr Asp Ala Val Asn Glu Gly Arg Ala Arg Lys Ala Pro115120125Gln lie Glu Ser Leu Thr Asp lie Ser Thr lie Arg Gly Leu Leu Ser130135140Ala Ala Ala Arg Asp Cys Arg His Glu Val Leu Thr Ala Gln Pro Glu145150155160Ala Leu Leu Glu Ser Thr Leu Ala Asp Ser Arg Pro Arg Asp Leu Ser[1584]165170175[1585]Leu Leu Thr Arg Gly lie Ala lie Arg Thr Val Tyr Pro His Thr Val180185190Leu Ser Ser Pro Ala Val Gln Gln His Phe Ser Leu Met His Glu Ala195200205Gly Thr Gln lie Arg Thr Thr Thr Gly Val Leu Asp Arg Val Val lie210215220Phe Asp Gln Ser Leu Ala Phe Leu Ala Asp Arg Arg Ser Asp Gly Pro225230235240Gly Ala 
Val Val lie Arg His Pro Ala Val Val Asp Tyr Leu Tyr Arg245250255Thr lie Glu Gln Val Trp Arg Leu Ala Lys Pro Phe Val Tyr Thr His260265270Val Gly Tyr Gly Pro Ala Ala Asp Glu lie Arg Ala Gly lie Leu Arg275280285Leu Met Ala Ala Gly Ala Lys Asp Glu ValIle Ala Lys Arg Met Asn290295300Met Ser Thr Arg Thr Cys Arg Arg His lie Ala Glu Met Met Ala Glu305310315320Leu Gly Ala Glu Ser Arg Phe Gln Ala Gly Val Leu Ala Ala Asp Arg325330335Gly Leu Leu Arg Leu Ser Gly Gly Pro Pro Pro Leu Arg Gly Phe Arg340345350Gly Leu Ser Gly355<210>17<211>527<212>PRT<213> 生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens ZJU5119)<400>17Met Lys Val Ala Leu Val Gly Pro Asn Gly Ala Gly Lys Thr Thr Leu151015Leu Arg Met lie Ala Gly Asp Leu Pro Val Thr Arg Gly Ala Val Ala202530Arg Ser Gly Gly Leu Gly Val Met Arg Gln Phe lie Gly Met Val Ser354045Asp Glu Thr Thr Leu Ala Gly Leu Ala Leu Ser Leu Ser Pro Ala Gly505560Leu Arg Gly Ala Gly Glu Ala Leu Ala Arg Ala Glu Thr Ala Met Ala65707580.[1624]Val Pro Gly Ala Gly Glu Lys Ala Gln Leu Arg Tyr Ala Glu Ala Leu859095Val Ala Trp Gly Asp Ala Gly Gly Tyr Glu Gln Glu Val Val Phe Asp100105110 [1628]Thr Val Val Thr Asp lie Leu Gly Thr Pro Trp Asp Glu Ala Arg Ser115120125Arg Pro Val Arg Thr Leu Ser Gly Gly Glu Gln Lys Arg Phe Ala Leu130135140Ser Leu Leu Leu Ala Gly Pro Asp Glu Val Leu Leu Leu Asp Glu Pro145150155160Asp Asn Phe Leu Asp Val Pro Gly Lys Arg Arg Leu Glu Ala Arg Leu165170175Ala Glu Ser Pro Lys Thr Val Leu Tyr Val Ser His Asp Arg Glu Leu180185190Leu Ala Asn Thr Ala Ser Arg Val Val Thr Val Glu Gly Gly Ser Ala195200205Trp Met His Pro Gly Ser Phe Ala Ser Trp His Asp Ala Arg Val Ser210215220Arg Tyr Glu Arg Phe Glu Glu Glu Arg Arg Arg Trp Asp Glu Glu His225230235240Ala Lys Leu Lys Glu Leu Val Arg His Tyr Gln Val Lys Ala Ser His245250255Asn Asp Ala Met Ala Ser Arg Leu Gln Ala Ala Arg Thr Arg Leu Ala260265270Lys Phe Glu Ala Gln Pro Pro Pro Pro Pro Arg Pro Arg Glu Gln Asn275280285lie Arg Met Arg Leu Thr Gly Asp Arg Thr Gly Lys Arg Ala Val Val290295300Cys Glu Arg Leu Gly 
Leu Asp Gly Leu Thr Asp Pro Phe Gly Phe Glu305310315320Ala Trp Tyr Gly Asp Arg lie Ala Val Leu Gly Ala Asn Gly Thr Gly325330335Lys Ser His Phe Leu Arg Leu Leu Gly Arg Gly Gly Ser Asp Pro Glu340345350Leu Pro Ser Leu Thr Pro Leu Glu Pro Val Ala His Thr Gly Ser Ala355360365Arg Leu Gly Ala Arg Val Val Pro Gly His Phe Ser Gln Thr His Asp370375380Arg Pro Glu Leu Val Gly Arg Thr Leu Glu Asp lie Leu Trp Lys Gly[1663]385390395400Asp Val Arg Arg Asp Ser Leu Pro Arg Asp Glu Ala Met Ala Ala Leu405410415 [1666]Gly Arg Tyr Glu Leu Ala Gly Gln Gly Gly Gln Arg Phe Glu Thr Leu420425430Ser Gly Gly Gln Gln Ala Arg Phe Leu lie Leu Leu Leu Glu Leu Ser435440445Gly Ala Thr Leu Leu Leu Leu Asp Glu Pro Thr Asp Asn Leu Asp Leu450455460Ala Ser Ala Glu Ala Leu Glu Gln Gly Leu Ala Gly Phe Arg Gly Thr465470475480Val Leu Ala Val Thr His Asp Arg Trp Phe Thr Arg Ser Phe Asp Arg485490495Phe Leu His Phe Arg Gly Asp Gly Ala Val Lys Glu Val Thr Ala Pro500505510Val Trp Glu Pro Ala Val Val Glu Gly Ala Gly Gln Ala Gly Arg515520525<210>18<211>256<212>PRT<213> 生裂鏈輪絲菌 ZJU5119(Str印toverticillium rimofaciens ZJU5119)<400>18Val lie Glu Asp Gly Gly Ser Ala Arg Gly Ser Val Thr Thr Val Arg151015Arg Val Gly Asp Thr Val Arg Arg Pro Arg Gly Arg Trp Thr Ala Asn202530Val His Ala Leu Leu Arg His Leu Ala Asp Ala Gly Phe Leu Arg Ala354045Pro Arg Ala Leu Gly Val Asp Glu Asp Gly Ser Glu lie Leu Ser Phe505560Leu Asp Gly Glu Val Ala Met Arg Pro Trp Pro Ala Ala Leu Arg Glu65707580Arg Ser Gly Val Val Glu Leu Ala Val Trp Leu Arg Glu Tyr His Asp859095Val Val Arg Asp Phe Arg Pro Pro Cys Pro Asp Glu Trp Phe Val Pro100105110Gly Val Ser Trp Arg Pro Gly Arg Leu Val Arg His Gly Asp Leu Gly115120125Pro Trp Asn Ser Val Trp Arg Gly Ser Arg Leu Val Gly Phe lie Asp[1702]130135140Trp Asp Phe Ala Glu Pro Gly Asp Pro Leu Asp Asp Leu Ala Gln Leu145150155160Ala Trp Tyr Cys Val Pro Leu Gly Gly Arg Ala Thr Gly Ala Gly Gly [1706]165170175Glu Glu Ser Arg Val Arg Val Arg Glu Arg Leu Ala Ala Val Cys Thr180185190Ala Tyr Gly Ala Glu Pro Val Ser Val Leu 
Asp Ala Leu Ala Gly Leu
195 200 205
Gln Glu Arg Glu Ala Arg Arg Ile Thr Asp Leu Gly Gly Arg Gly Leu
210 215 220
Glu Pro Trp Thr Ser Phe Leu Ala Arg Gly Asp Ala Thr Ala Ile Glu
225 230 235 240
Glu Glu Arg Ala Trp Leu Leu Thr His Arg Glu Gly Leu Leu Val Gly
245 250 255

Claims
1. A mildiomycin biosynthetic gene cluster, characterized in that its sequence is as shown in SEQ ID NO: 1.
2. The mildiomycin biosynthetic gene cluster according to claim 1, characterized in that the gene cluster comprises 16 genes. There are 11 structural genes: milA, milB, milC, milD, milE, milG, milH, milJ, milM, milN and milQ; wherein gene milA is located at positions 6125 to 7126 of SEQ ID NO: 1, gene milB at positions 7252 to 7761, gene milC at positions 7906 to 9165, gene milD at positions 9185 to 10369, gene milE at positions 10380 to 11198, gene milG at positions 11627 to 12631, gene milH at positions 12729 to 14948, gene milJ at positions 16202 to 17152, gene milM at positions 19548 to 20714, gene milN at positions 20710 to 21483, and gene milQ at positions 25168 to 25935 of SEQ ID NO: 1. There are 2 regulatory genes, milK and milO; wherein gene milK is located at positions 17152 to 18477 and gene milO at positions 23289 to 22222 of SEQ ID NO: 1. There is 1 resistance gene, milP, located at positions 23298 to 24878 of SEQ ID NO: 1. There are 2 other genes, milF and milI; wherein gene milF is located at positions 11194 to 11664 and gene milI at positions 14948 to 16027 of SEQ ID NO: 1.
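3. The mildiomycin biosynthetic gene cluster according to claim 2, characterized in that the proteins encoded by the 11 structural genes are as follows: the protein encoded by gene milA has the sequence shown in SEQ ID NO: 2 and is a CMP hydroxymethyltransferase; the protein encoded by gene milB has the sequence shown in SEQ ID NO: 3 and is a CMP/hydroxymethyl hydrolase; the protein encoded by gene milC has the sequence shown in SEQ ID NO: 4 and is a cytosine/hydroxymethylcytosine glucuronic acid synthase; the protein encoded by gene milD has the sequence shown in SEQ ID NO: 5 and is a DegT/DnrJ/EryC1/StrS aminotransferase; the protein encoded by gene milE has the sequence shown in SEQ ID NO: 6 and is an aminoglycoside phosphotransferase; the protein encoded by gene milG has the sequence shown in SEQ ID NO: 8 and is a radical SAM protein; the protein encoded by gene milH has the sequence shown in SEQ ID NO: 9 and is a ligase; the protein encoded by gene milJ has the sequence shown in SEQ ID NO: 11 and is an arginine hydroxylase; the protein encoded by gene milM has the sequence shown in SEQ ID NO: 14 and is an Asp/Tyr/Aro aminotransferase; the protein encoded by gene milN has the sequence shown in SEQ ID NO: 15 and is a dihydrodipicolinate synthase; the protein encoded by gene milQ has the sequence shown in SEQ ID NO: 18 and is an aminoglycoside phosphotransferase.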
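4. The mildiomycin biosynthetic gene cluster according to claim 2, characterized in that the proteins encoded by the 2 regulatory genes are as follows: the protein encoded by gene milK has the sequence shown in SEQ ID NO: 12 and is a major facilitator superfamily protein; the protein encoded by gene milO has the sequence shown in SEQ ID NO: 16 and is a LuxR family regulatory protein.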
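5. The mildiomycin biosynthetic gene cluster according to claim 2, characterized in that the protein encoded by the 1 resistance gene is as follows: the protein encoded by gene milP has the sequence shown in SEQ ID NO: 17 and is an ABC transporter.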
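The coordinates in claim 2 can be cross-checked against the protein lengths given in the `<211>` fields of the sequence listing. The sketch below is illustrative only and not part of the patent: the names `GENES`, `PROTEIN_LENGTHS`, `describe` and `check_all` are hypothetical, and reading a start position larger than the end position (milO) as a gene on the complementary strand is an assumption, since the claims do not state strand explicitly. The claims give no SEQ ID NO for milF and milI, so none is assigned to them here.

```python
# (gene, start, end, SEQ ID NO of the encoded protein) taken from claims 2 to 5.
# None marks genes for which the claims give no protein SEQ ID NO.
GENES = [
    ("milA", 6125, 7126, 2), ("milB", 7252, 7761, 3),
    ("milC", 7906, 9165, 4), ("milD", 9185, 10369, 5),
    ("milE", 10380, 11198, 6), ("milF", 11194, 11664, None),
    ("milG", 11627, 12631, 8), ("milH", 12729, 14948, 9),
    ("milI", 14948, 16027, None), ("milJ", 16202, 17152, 11),
    ("milK", 17152, 18477, 12), ("milM", 19548, 20714, 14),
    ("milN", 20710, 21483, 15), ("milO", 23289, 22222, 16),
    ("milP", 23298, 24878, 17), ("milQ", 25168, 25935, 18),
]

# Protein lengths (<211> values) from the sequence listing, keyed by SEQ ID NO.
PROTEIN_LENGTHS = {
    2: 334, 3: 170, 4: 420, 5: 395, 6: 273, 8: 335, 9: 740,
    11: 317, 12: 442, 14: 389, 15: 258, 16: 356, 17: 527, 18: 256,
}


def describe(start, end):
    """Return (strand, span in bp, whole codons, leftover bases)."""
    span = abs(end - start) + 1
    # Assumption: reversed coordinates indicate the complementary strand.
    strand = "+" if end >= start else "-"
    codons, leftover = divmod(span, 3)
    return strand, span, codons, leftover


def check_all():
    """Build one row per gene: name, strand, span, codons, leftover, <211>."""
    rows = []
    for name, start, end, seq_id in GENES:
        strand, span, codons, leftover = describe(start, end)
        rows.append((name, strand, span, codons, leftover,
                     PROTEIN_LENGTHS.get(seq_id)))
    return rows


if __name__ == "__main__":
    for name, strand, span, codons, leftover, listed in check_all():
        print(f"{name} {strand} {span:5d} bp {codons:4d} codons "
              f"(leftover {leftover}) listed length: {listed}")
```

Under these assumptions every span is a whole number of codons, and for each gene with a stated SEQ ID NO the codon count equals the listed protein length.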
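Abstract
A mildiomycin biosynthetic gene cluster in the field of biotechnology. The sequence of the gene cluster is as shown in SEQ ID NO: 1. The gene cluster comprises 16 genes: 11 structural genes (milA, milB, milC, milD, milE, milG, milH, milJ, milM, milN and milQ); 2 regulatory genes (milK and milO); 1 resistance gene (milP); and 2 other genes (milF and milI). The invention provides all gene and protein information related to mildiomycin biosynthesis and lays a foundation for the biosynthesis and genetic engineering of mildiomycin. The mildiomycin biosynthetic gene cluster and its proteins can be widely used in agriculture, industry and medicine.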
Document number: C12N15/52GK101812472SQ200910056338
Publication date: August 25, 2010. Filing date: August 13, 2009. Priority date: August 13, 2009.
Inventors: Xu Zhinan, Li Li, He Xinyi, Deng Zixin. Applicant: Shanghai Jiao Tong University