Full-Length Enriched cDNA Libraries
For the RIKEN full-length cDNA discovery program we use full-length cDNA libraries prepared by combining several technologies. The first is the elongation method, which is based on trehalose-thermostabilized reverse transcriptase and allows the preparation of very long first-strand cDNAs. We also developed the cap-trapper, a method to separate and select full-length cDNAs from truncated ones. Further, we developed a normalization/subtraction method, which reduces the frequency of highly expressed cDNAs and removes cDNAs that have already been selected by one-pass sequencing. This notably reduces sequencing redundancy, allowing us to increase the rate of new gene discovery. Additionally, cloning vectors with bulk excision capability, which preferentially clone large cDNAs, have been developed and used to clone long, rare cDNAs. To extend full-length cDNA discovery to any tissue, we have been adapting these protocols to micro-scale tissues, such as preimplantation embryos. To further facilitate the sequencing operation and improve its yield, we also prepared cDNA libraries devoid of homopolymers (such as the polyA and G-tailing). Later in the project, cDNA libraries were prepared from mixed tissues. The tissue from which a cDNA clone originated can be recognized by the presence of a tag next to the 3' end cloning site. To isolate a full-length cDNA for most genes, we prepared and characterized cDNA libraries from about 200 tissues and cell types.
- Carninci P., Shibata Y., Hayatsu N., Sugahara Y., Shibata K., Itoh M., Konno H., Okazaki Y., Muramatsu M. and Hayashizaki Y., Normalization and subtraction of cap-trapper selected cDNAs to prepare full-length cDNA libraries for high-rate new gene discovery, Genome Res., 10, 1617-1630 (2000)
- Carninci P. and Hayashizaki Y., High-efficiency full-length cDNA cloning, Methods Enzymol., 303, 19-44 (1999)
- Carninci P., Nishiyama Y., Westover A., Itoh M., Nagaoka S., Sasaki N., Okazaki Y., Muramatsu M. and Hayashizaki Y., Thermostabilization and thermoactivation of thermolabile enzymes by trehalose and its application for the synthesis of full length cDNA, Proc. Natl. Acad. Sci. USA, 95, 520-524 (1998)
- Carninci P., Kvam C., Kitamura A., Ohsumi T., Okazaki Y., Itoh M., Kamiya M., Shibata K., Sasaki N., Izawa M., Muramatsu M., Hayashizaki Y. and Schneider C., High-efficiency full-length cDNA cloning by biotinylated CAP trapper, Genomics, 37, 327-336 (1996)
- Carninci P., Westover A., Nishiyama Y., Ohsumi T., Itoh M., Nagaoka S., Sasaki N., Okazaki Y., Muramatsu M., Schneider C. and Hayashizaki Y., High efficiency selection of full-length cDNA by improved biotinylated cap trapper, DNA Res., 4, 61-66 (1997)
The RISA Sequencing Line
To achieve high-throughput sequencing, we developed a complete 384-channel format sequencing pipeline, the RISA sequencing line, which can handle 40,000 samples in 17.5 hours. The RISA sequencing line consists of colony picking, template preparation, sequencing reaction and sequence reading. We have developed accessory instruments such as the RISA inoculator, the RISA filtrator and densitometer, and the RISA plasmid preparator. The system also includes RISA thermal cyclers, which have four 384-well sites, and the novel high-throughput 384-format capillary sequencing system (RISA sequencer). All of these parts are described in more detail in the following sections.
- Shibata K., Itoh M., Aizawa K., Nagaoka S., Sasaki N., Carninci P., Konno H., Akiyama J., Nishi K., Kitsunai T., Tashiro H., Itoh M., Kikuchi N., Ishii Y., Nakamura S., Hazama M., Nishine T., Harada A., Yamamoto R., Matsumoto H., Sakaguchi S., Ikegami T., Kashiwagi K., Fujiwake S., Inoue K., Togawa Y., Izawa M., Ohara E., Watahiki M., Yoneda Y., Ishikawa T., Ozawa K., Tanaka T., Matsuura S., Okazaki Y., Muramatsu M., Inoue Y. and Hayashizaki Y., RIKEN integrated sequence analysis system (RISA system) - 384-format sequencing pipeline with 384 multi-capillary sequencer, Genome Res., 10, 1757-1771 (2000)
High-throughput Plasmid Preparation System
This system is based on the modified filtration method (1,2). Cell harvesting, alkaline lysis, and plasmid purification are performed on a single 96-filter plate, followed by collection of the sequence-ready DNA samples with elution buffer. The plates are designed so that all reagents are injected from the top side of the 96 columns and used reagents are removed by vacuum from the bottom side of the plates. This principle enabled us to build a plasmid preparator that employs a linear process, which simplifies automation.
This system consists of the RISA inoculator, the RISA densitometer/filtrator, and the RISA plasmid preparator. The RISA inoculator dispenses an appropriate volume of medium and inoculates it with a set of 96 E. coli clones. Dispensing and inoculation take ca. 45 sec per 96-well deepwell plate. After cultivation, the RISA densitometer/filtrator measures the optical density of each well of the deepwell plate, flags failed wells, and transfers the culture from deepwell plates to filter plates. The transferred culture liquid is aspirated by vacuum to harvest the E. coli cells on the filter. The RISA plasmid preparator has 27 stages, several of which have reagent dispensers. Each cell-harvested filter plate is sent from the plate stacker to the first stage and then moves to the next stage at a fixed time interval. At each stage, treatments such as dispensing and/or vacuum filtration are carried out within that fixed interval. Finally, treated and dried filter plates are stacked and sent to the receiver stacker. The instrument prepares 40,000 samples in 17.5 hours.
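The linear, fixed-interval design makes the throughput easy to estimate. The sketch below is a back-of-the-envelope model, not the instrument's control software: it assumes that plates enter the preparator back to back and that each of the 27 stages takes the same interval. The function names are illustrative, and the stage interval it derives is only an implied value, since the text does not state one.

```python
import math

def plates_needed(samples, wells_per_plate=96):
    """Number of 96-well filter plates required for a given number of samples."""
    return math.ceil(samples / wells_per_plate)

def pipeline_time_s(n_plates, n_stages, interval_s):
    """Total run time of a linear pipeline: the first plate needs n_stages
    intervals, and every further plate finishes one interval later."""
    return (n_plates + n_stages - 1) * interval_s

def implied_interval_s(total_s, n_plates, n_stages):
    """Stage interval implied by a total run time (assumption-based estimate)."""
    return total_s / (n_plates + n_stages - 1)

# Figures quoted in the text: 40,000 samples, 27 stages, 17.5 hours.
plates = plates_needed(40_000)
interval = implied_interval_s(17.5 * 3600, plates, 27)
print(plates, round(interval, 1))
```

Under these assumptions the quoted run corresponds to a stage interval of a little over two minutes; a real run may differ if plates are not fed continuously.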
- Itoh M., Kitsunai T., Akiyama J., Shibata K., Izawa M., Kawai J., Tomaru Y., Carninci P., Shibata Y., Ozawa Y., Muramatsu M., Okazaki Y. and Hayashizaki Y., Automated filtration-based high-throughput plasmid preparation system, Genome Res., 9, 463-470 (1999)
- Itoh M., Carninci P., Nagaoka S., Sasaki N., Okazaki Y., Ohsumi T., Muramatsu M. and Hayashizaki Y., Simple and rapid preparation of plasmid template by a filtration method using microtiter filter plates, Nucleic Acids Res., 25, 1315-1316 (1997)
A Novel Control System of Polymerase Chain Reaction using RIKEN GS384 Thermalcycler (RISA thermal cycler)
The RIKEN GS384 thermalcycler has four 384-well heat blocks. The four blocks are controlled independently, and each lid, which is heated to prevent evaporation of the samples, is operated from the front panel. The temperature of the 1536 sample wells is controlled accurately, without variability among the wells, and evaporation is not a problem even for very small sample volumes (2 microliters).
- Sasaki N., Izawa M., Shimojo M., Itoh M., Nagaoka S., Carninci P., Okazaki Y., Muramatsu M. and Hayashizaki Y., A novel control system for polymerase chain reaction using a RIKEN GS384 thermalcycler, DNA Res., 4, 387-391 (1997)
The RISA sequencer system -- a high throughput 384-format capillary sequencer system --
The RISA sequencer system is a novel high throughput 384-format capillary sequencer system developed for the RISA sequencing line. This system consists of (1) a 384 multi-capillary auto sequencer (RISA sequencer), (2) a 384 multi-capillary array assembler (CAS), and (3) a 384 multi-capillary gel casting device (GVT). The RISA sequencer can simultaneously analyze 384 independent sequencing products. For long read sequencing, 380 out of 384 sequences (99.2%) were successfully analyzed and the average read length, with more than 99% accuracy, was 654.4 bp. A single RISA sequencer can analyze 216 kb with more than 99% accuracy in 2.7 hrs (90 kb/hr). For short read sequencing (350 bp), 384 samples can be analyzed in 1.5 hrs.
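The throughput figures can be related to one another with simple arithmetic. The sketch below only recomputes rates from numbers quoted in the paragraph; the helper names are illustrative, and the short-read case assumes that all 384 samples yield a 350 bp read. Recomputed values may differ somewhat from the rounded figures in the text, which presumably rest on details given in the cited paper.

```python
def success_rate_percent(successful, total):
    """Share of lanes analyzed successfully, in percent."""
    return 100.0 * successful / total

def kb_per_hour(total_bp, hours):
    """Sequencing throughput in kilobases per hour."""
    return total_bp / 1000.0 / hours

# Long-read mode: 216 kb in 2.7 hrs (figures quoted in the text).
long_read_rate = kb_per_hour(216_000, 2.7)
# Short-read mode: 384 samples of 350 bp in 1.5 hrs (350 bp per sample is assumed).
short_read_rate = kb_per_hour(384 * 350, 1.5)

print(round(success_rate_percent(380, 384), 2))
print(round(long_read_rate, 1), round(short_read_rate, 1))
```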
- Shibata K., Itoh M., Aizawa K., Nagaoka S., Sasaki N., Carninci P., Konno H., Akiyama J., Nishi K., Kitsunai T., Tashiro H., Itoh M., Kikuchi N., Ishii Y., Nakamura S., Hazama M., Nishine T., Harada A., Yamamoto R., Matsumoto H., Sakaguchi S., Ikegami T., Kashiwagi K., Fujiwake S., Inoue K., Togawa Y., Izawa M., Ohara E., Watahiki M., Yoneda Y., Ishikawa T., Ozawa K., Tanaka T., Matsuura S., Okazaki Y., Muramatsu M., Inoue Y. and Hayashizaki Y., RIKEN integrated sequence analysis system (RISA system) - 384-format sequencing pipeline with 384 multi-capillary sequencer, Genome Res., 10, 1757-1771 (2000)
Data Management System
We have developed many programs to analyze sequence data produced by the RISA sequencing line, including a tool for automatically classifying (clustering) cDNA clones based on the 3'-end sequences of 5'-end validated cDNA libraries and automatically registering sequence data in the cDNA encyclopedia database.
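To give a concrete idea of what clustering clones by their 3'-end sequences involves, the sketch below groups reads greedily by shared k-mer content. It is a toy illustration under stated assumptions, not the tool described above: the function names, the k-mer size, the Jaccard similarity measure and the threshold are all choices made for this example.

```python
def kmers(seq, k=4):
    """Set of all k-mers in a sequence (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity between two k-mer sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def cluster_reads(reads, k=4, threshold=0.5):
    """Greedy clustering: each read joins the first cluster whose
    representative (its first member) is similar enough, else starts a new one."""
    clusters = []  # list of (representative k-mer set, [clone ids])
    for clone_id, seq in reads.items():
        km = kmers(seq, k)
        for rep, members in clusters:
            if jaccard(km, rep) >= threshold:
                members.append(clone_id)
                break
        else:
            clusters.append((km, [clone_id]))
    return [members for _, members in clusters]

# Hypothetical 3'-end reads keyed by hypothetical clone IDs.
reads = {
    "cloneA": "ACGTTGCAAGCT",
    "cloneB": "ACGTTGCAAGCT",
    "cloneC": "GGGGGGGGCCCC",
}
print(cluster_reads(reads))  # [['cloneA', 'cloneB'], ['cloneC']]
```

A production system would use alignment-based comparison and tolerate sequencing errors; the greedy scheme is shown only because it can process reads one at a time as they arrive.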
We have also established an assembly and primer design system, which is used to design primer sequences for primer walking during full-length sequencing of cDNA clones. The assembly system handles three kinds of sequence data produced by different types of sequencers (ABI, RISA, and LICOR) in a single format. If a gap remains, primers for primer-walking sequencing can be easily designed. Publicly available sequences, such as EST data, can also be used to fill the gaps in a certain number of clones.
We have a database system based on Sybase DBMS to manage most of the information derived from our sequencing system, as well as information about the full-length cDNA libraries such as tissue, strain, and protocols. The program also manages the assignment of clone IDs, the 3'- and 5'-end sequences, full-length sequences, and clustering information. This database is updated routinely.
- Konno H., Fukunishi Y., Shibata K., Itoh M., Carninci P., Sugahara Y. and Hayashizaki Y., Computer-based methods for a mouse full-length cDNA project: real-time sequence clustering for construction of a non-redundant library, Genome Res., 11, 281-289 (2001)
- Fukunishi Y., Suzuki H., Yoshino M., Konno H. and Hayashizaki Y., Prediction of human cDNA from its homologous mouse full-length cDNA and human shotgun database, FEBS Lett., 464, 129-132 (1999)
Functional Annotation of Full-length cDNAs
FANTOM stands for "Functional Annotation of the Mouse", or "Functional Annotation of the Mammalians", as we have also produced data to analyze the human transcriptome. Over the last few years, the objectives of our research activities have been pursued with the support of the FANTOM International Consortium, organized and led by RIKEN, and more recently with the support of the Genome Network Project.
In the previous FANTOM-1 and FANTOM-2 Projects we focused on full-length cDNA construction, sequencing and annotation, as discussed in the FANTOM-1 and FANTOM-2 scientific meetings. Through FANTOM-1 and FANTOM-2, 21,076 and 39,694 cDNA clones, respectively, were sequenced and annotated, for a total of 60,770 cDNA clones. The cDNA clone ID information has been open to the public for years at the FANTOM-2 web site (http://fantom2.gsc.riken.go.jp/fantom2/doc/release/idtable2.0-2.1.txt).
In preparing the FANTOM-3 Project, we modified our strategy in order to obtain data that reveal the dynamic regulation of the transcriptome. In particular, we have prepared a novel dataset aiming at:
(1) Identification of new full-length cDNAs, for a total of 103,000 clones.
(2) Determination and annotation of transcripts and transcriptional units, and of their complexity and expression.
(3) Experimental identification of transcriptional start sites and termination sites, and assessment of the outstanding complexity of the human transcriptome.
(4) Identification of promoter regions.
Initial planning for FANTOM-3 started at the end of 2002, just after the publication in Nature of the annotation of 60,770 full-length cDNAs. At that time, we decided to prepare novel data to analyze the transcriptome in a more functional way. We wanted not only to identify and annotate new or rarely expressed mRNAs; our ultimate purpose was to functionally annotate the complexity of the transcriptome and to identify the start sites and termination sites of the mRNAs and their promoters. One additional aspect of understanding the complexity of the transcriptome was making sense of the noncoding RNAs, which constitute up to half of the transcriptome and are involved in the formation of sense-antisense pairs.
We have prepared novel cDNAs, for a total of 103,000 full-length cDNAs. These cDNAs were annotated in an annotation teleconference called MATRICS-RELOADED, similar to the MATRICS annotation teleconference that took place early in 2002 for the FANTOM-2 project. The MATRICS-RELOADED teleconference involved more than 100 scientists from all over the world.
As novel cDNAs are useful but not sufficient to provide functional annotation of the transcriptome, we prepared three completely new types of data. The first is the cap-analysis gene expression (CAGE) libraries. CAGE technology allows high-throughput gene expression analysis and the profiling of transcriptional start points, including promoter usage analysis. We have prepared two additional datasets, the Gene Identification Signature (GIS), in collaboration with the Genome Institute of Singapore, and the Genome Signature Cloning (GSC) (unpublished). These two technologies capture tag signatures of both the 5' and 3' ends, and therefore allow high-throughput identification of mRNA variants. GSC differs from GIS in using subtracted libraries, which allow the detection of rare transcripts.
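As an illustration of how CAGE tags lead to start-site profiles, the sketch below counts the 5' positions of mapped tags and merges nearby positions on the same strand into tag clusters, whose tag counts can be read as a rough measure of promoter usage. It is a minimal sketch, not the pipeline used in the project: the input format, the function name `tag_clusters` and the `max_gap` value of 20 bp are assumptions made for this example.

```python
from collections import Counter

def tag_clusters(tag_starts, max_gap=20):
    """Group mapped tag 5' positions into clusters.

    tag_starts: iterable of (chrom, strand, position) tuples, one per tag.
    Positions on the same chromosome and strand that lie within max_gap
    of the previous position are merged into one cluster.
    """
    counts = Counter(tag_starts)
    clusters = []
    for (chrom, strand, pos), n in sorted(counts.items()):
        last = clusters[-1] if clusters else None
        if (last and last["chrom"] == chrom and last["strand"] == strand
                and pos - last["end"] <= max_gap):
            last["end"] = pos      # extend the current cluster
            last["tags"] += n
        else:
            clusters.append({"chrom": chrom, "strand": strand,
                             "start": pos, "end": pos, "tags": n})
    return clusters

# Hypothetical mapped tags.
tags = [("chr1", "+", 100), ("chr1", "+", 100), ("chr1", "+", 105),
        ("chr1", "+", 500), ("chr1", "-", 102)]
for cluster in tag_clusters(tags):
    print(cluster)
```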
After the preparation of the datasets, completed at the end of June 2004, we met twice with a substantial part of the members of the consortium. The FANTOM-3 pre-meeting (Tanabata meeting) took place on July 4th-8th, 2004 at the RIKEN Yokohama Institute in Yokohama, where we had a first look at the data and divided the tasks among working groups. Later, after two months of additional analysis, we held the final meeting, the FANTOM-3 Harvest Meeting, on September 10th-14th, 2004 at the RIKEN Main Campus in Wako-city.
During the Harvest Meeting, we discussed the data and the vision of the transcriptome as revealed by the novel dataset in an interactive format. The meeting was intentionally organized without a fixed program, in order to maximize discussion and brainstorming, and ended with a final summary of the main findings, which were then prepared for publication.
The work of the consortium was recognized by the publication of two papers in the special RNA issue of Science of September 2nd, 2005. The articles, "The transcriptional landscape of the mammalian genome" and "Antisense transcription in the mammalian transcriptome", provide a novel view of the complexity of the transcriptome, of the concepts of gene and independent transcript, and of the cross-regulation of sense-antisense RNA networks.
We have also prepared new databases and genome/transcriptome viewers that take into account the complexity and dynamics of the transcriptome. In particular, two databases (the CAGE basic viewer and the CAGE analysis viewer) allow storing and analyzing the CAGE data. These databases are available without restrictions, and we hope that they will benefit the whole scientific community.
The genome information is a code that constitutes the blueprint of the features of a human being. The main output of the genome is expressed RNA, which includes the mRNAs (messenger RNAs). To devise an original approach to understanding genome function, we did not limit ourselves to simple bioinformatic comparisons with other genome sequences, but carried out the most comprehensive sequencing-based functional RNA mapping available for any transcriptome so far.
We have extensively applied the above technologies to characterize the transcriptome, and analyzed the data together in a pooling strategy to study the mouse genome and to link our findings to the human genome. We have identified 181,047 transcripts with defined transcriptional boundaries, with extensive variation in transcripts arising from alternative promoter usage, splicing and polyadenylation. All of these transcripts originate from 44,147 transcriptional units (TUs). This is the first time that such extensive experimental evidence of transcript variability has been found. Notably, more than half of these TUs contain exclusively non-protein-coding RNA, whose promoter regions are extensively conserved. Genomic mapping of the transcriptome reveals large transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few or no transcripts are observed. These data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development. Among the new full-length cDNA sequences, we have identified 16,247 new mouse protein-coding transcripts, including 5,154 transcripts encoding proteins that are either very different from known proteins or completely new.
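The relation between transcripts and transcriptional units can be illustrated with a small grouping routine. The sketch below merges transcripts on the same chromosome and strand whose genomic spans overlap. This is a deliberate simplification made for this example (it uses whole-transcript spans and ignores exon structure) and should not be read as the exact TU definition used in the cited papers; the tuple layout and function name are likewise assumptions.

```python
from collections import defaultdict

def group_into_units(transcripts):
    """Group transcripts into units of overlapping same-strand spans.

    transcripts: list of (name, chrom, strand, start, end), end exclusive.
    Returns a list of units, each a list of transcript names.
    """
    by_key = defaultdict(list)
    for name, chrom, strand, start, end in transcripts:
        by_key[(chrom, strand)].append((start, end, name))

    units = []
    for key in sorted(by_key):
        current, current_end = [], None
        for start, end, name in sorted(by_key[key]):
            if current and start < current_end:
                current.append(name)               # overlaps the running unit
                current_end = max(current_end, end)
            else:
                if current:
                    units.append(current)
                current, current_end = [name], end
        units.append(current)
    return units

# Hypothetical transcripts.
example = [
    ("t1", "chr1", "+", 100, 500),
    ("t2", "chr1", "+", 400, 900),
    ("t3", "chr1", "-", 450, 800),
    ("t4", "chr1", "+", 1000, 1200),
]
print(group_into_units(example))  # [['t1', 't2'], ['t4'], ['t3']]
```

With the figures quoted above, 181,047 transcripts in 44,147 TUs correspond to roughly four transcripts per unit on average.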
In another aspect of our study, we analyzed the overlap of sense and antisense mRNAs in the genome. In particular, the analysis of physical cDNA clones, sequence tags and ESTs provides compelling evidence that sense-antisense (S/AS) overlap is almost universal. S/AS pairs are especially abundant in imprinted loci, in keeping with putative roles in gene silencing. Expression analysis revealed frequent concordant regulation of S/AS pairs, which argues against simple models of interaction based upon conventional RNAi phenomena. We provide experimental evidence that perturbation of an antisense RNA can alter the expression of sense mRNAs, suggesting that antisense transcription contributes to the control of transcriptional outputs in mammals.
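Detecting a sense-antisense overlap reduces, in its simplest form, to finding transcripts on opposite strands whose genomic spans intersect. The sketch below shows that test on hypothetical coordinates. It is an assumption-laden illustration (span-level overlap only, a quadratic pairwise scan, an invented tuple layout), not the analysis performed in the study.

```python
def sense_antisense_pairs(transcripts):
    """Return (plus_name, minus_name) pairs whose spans overlap.

    transcripts: list of (name, chrom, strand, start, end), end exclusive.
    """
    plus = [t for t in transcripts if t[2] == "+"]
    minus = [t for t in transcripts if t[2] == "-"]
    pairs = []
    for p_name, p_chrom, _, p_start, p_end in plus:
        for m_name, m_chrom, _, m_start, m_end in minus:
            same_chrom = p_chrom == m_chrom
            overlap = p_start < m_end and m_start < p_end
            if same_chrom and overlap:
                pairs.append((p_name, m_name))
    return pairs

# Hypothetical transcripts.
loci = [
    ("s1", "chr1", "+", 100, 500),
    ("s2", "chr1", "+", 400, 900),
    ("as1", "chr1", "-", 450, 800),
    ("s3", "chr1", "+", 1000, 1200),
]
print(sense_antisense_pairs(loci))  # [('s1', 'as1'), ('s2', 'as1')]
```

For genome-scale data the pairwise scan would be replaced by a sorted sweep or an interval index; the overlap condition itself stays the same.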
In conclusion, our "transcriptome" data represent the output of genome function; the comprehensive nature of this dataset paves the way to a reinterpretation of the function of the genome and of its impact on many aspects of biomedical research.
- Carninci P., Kasukawa T., Katayama S., Gough J., Frith M.C., Maeda N., Oyama R., Ravasi T., Lenhard B., Wells C., Kodzius R., Shimokawa K., Bajic V.B., Brenner S.E., Batalov S., Forrest A.R., Zavolan M., Davis M.J., Wilming L.G., Aidinis V., Allen J.E., Ambesi-Impiombato A., Apweiler R., Aturaliya R.N., Bailey T.L., Bansal M., Baxter L., Beisel K.W., Bersano T., Bono H., Chalk A.M., Chiu K.P., Choudhary V., Christoffels A., Clutterbuck D.R., Crowe M.L., Dalla E., Dalrymple B.P., de Bono B., Della Gatta G., di Bernardo D., Down T., Engstrom P., Fagiolini M., Faulkner G., Fletcher C.F., Fukushima T., Furuno M., Futaki S., Gariboldi M., Georgii-Hemming P., Gingeras T.R., Gojobori T., Green R.E., Gustincich S., Harbers M., Hayashi Y., Hensch T.K., Hirokawa N., Hill D., Huminiecki L., Iacono M., Ikeo K., Iwama A., Ishikawa T., Jakt M., Kanapin A., Katoh M., Kawasawa Y., Kelso J., Kitamura H., Kitano H., Kollias G., Krishnan S.P., Kruger A., Kummerfeld S.K., Kurochkin I.V., Lareau L.F., Lazarevic D., Lipovich L., Liu J., Liuni S., McWilliam S., Madan Babu M., Madera M., Marchionni L., Matsuda H., Matsuzawa S., Miki H., Mignone F., Miyake S., Morris K., Mottagui-Tabar S., Mulder N., Nakano N., Nakauchi H., Ng P., Nilsson R., Nishiguchi S., Nishikawa S., Nori F., Ohara O., Okazaki Y., Orlando V., Pang K.C., Pavan W.J., Pavesi G., Pesole G., Petrovsky N., Piazza S., Reed J., Reid J.F., Ring B.Z., Ringwald M., Rost B., Ruan Y., Salzberg S.L., Sandelin A., Schneider C., Schonbach C., Sekiguchi K., Semple C.A., Seno S., Sessa L., Sheng Y., Shibata Y., Shimada H., Shimada K., Silva D., Sinclair B., Sperling S., Stupka E., Sugiura K., Sultana R., Takenaka Y., Taki K., Tammoja K., Tan S.L., Tang S., Taylor M.S., Tegner J., Teichmann S.A., Ueda H.R., van Nimwegen E., Verardo R., Wei C.L., Yagi K., Yamanishi H., Zabarovsky E., Zhu S., Zimmer A., Hide W., Bult C., Grimmond S.M., Teasdale R.D., Liu E.T., Brusic V., Quackenbush J., Wahlestedt C., Mattick J.S., Hume D.A., Kai C., 
Sasaki D., Tomaru Y., Fukuda S., Kanamori-Katayama M., Suzuki M., Aoki J., Arakawa T., Iida J., Imamura K., Itoh M., Kato T., Kawaji H., Kawagashira N., Kawashima T., Kojima M., Kondo S., Konno H., Nakano K., Ninomiya N., Nishio T., Okada M., Plessy C., Shibata K., Shiraki T., Suzuki S., Tagami M., Waki K., Watahiki A., Okamura-Oho Y., Suzuki H., Kawai J., Hayashizaki Y; FANTOM Consortium; RIKEN Genome Exploration Research Group and Genome Science Group (Genome Network Project Core Group), The transcriptional landscape of the mammalian genome, Science, 309, 1559 - 63 (2005)
- Katayama S., Tomaru Y., Kasukawa T., Waki K., Nakanishi M., Nakamura M., Nishida H., Yap C.C., Suzuki M., Kawai J., Suzuki H., Carninci P., Hayashizaki Y., Wells C., Frith M.C., Ravasi T., Pang K.C., Hallinan J., Mattick J., Hume D.A., Lipovich L., Batalov S., Engstrom P.G., Mizuno Y., Faghihi M.A., Sandelin A., Chalk A.M., Mottagui-Tabar S., Liang Z., Lenhard B., Wahlestedt C; RIKEN Genome Exploration Research Group; Genome Science Group (Genome Network Project Core Group); FANTOM Consortium, Antisense transcription in the mammalian transcriptome, Science, 309, 1564 - 6 (2005)
- Kawai J., Shinagawa A., Shibata K., Yoshino M., Itoh M., Ishii Y., Arakawa T., Hara A., Fukunishi Y., Konno H., Adachi J., Fukuda S., Aizawa K., Izawa M., Nishi K., Kiyosawa H., Kondo S., Yamanaka I., Saito T., Okazaki Y., Gojobori T., Bono H., Kasukawa T., Saito R., Kadota K., Matsuda H., Ashburner M., Batalov S., Casavant T., Fleischmann W., Gaasterland T., Gissi C., King B., Kochiwa H., Kuehl P., Lewis S., Matsuo Y., Nikaido I., Pesole G., Quackenbush J., Schriml L.M., Staubli F., Suzuki R., Tomita M., Wagner L., Washio T., Sakai K., Okido T., Furuno M., Aono H., Baldarelli R., Barsh G., Blake J., Boffelli D., Bojunga N., Carninci P., de Bonaldo MF., Brownstein MJ., Bult C., Fletcher C., Fujita M., Gariboldi M., Gustincich S., Hill D., Hofmann M., Hume D.A., Kamiya M., Lee N.H., Lyons P., Marchionni L., Mashima J., Mazzarelli J., Mombaerts P., Nordone P., Ring B., Ringwald M., Rodriguez I., Sakamoto N., Sasaki H., Sato K., Schonbach C., Seya T., Shibata Y., Storch K.F., Suzuki H., Toyo-oka K., Wang K.H., Weitz C., Whittaker C., Wilming L., Wynshaw-Boris A., Yoshida K., Hasegawa Y., Kawaji H., Kohtsuki S., Hayashizaki Y.; RIKEN Genome Exploration Research Group Phase II Team and the FANTOM Consortium, Functional annotation of a full-length mouse cDNA collection, Nature, 409, 685 - 90 (2001)
- Okazaki Y., Furuno M., Kasukawa T., Adachi J., Bono H., Kondo S., Nikaido I., Osato N., Saito R., Suzuki H., Yamanaka I., Kiyosawa H., Yagi K., Tomaru Y., Hasegawa Y., Nogami A., Schonbach C., Gojobori T., Baldarelli R., Hill D.P., Bult C., Hume D.A., Quackenbush J., Schriml L.M., Kanapin A., Matsuda H., Batalov S., Beisel K.W., Blake J.A., Bradt D., Brusic V., Chothia C., Corbani L.E., Cousins S., Dalla E., Dragani T.A., Fletcher C.F., Forrest A., Frazer K.S., Gaasterland T., Gariboldi M., Gissi C., Godzik A., Gough J., Grimmond S., Gustincich S., Hirokawa N., Jackson I.J., Jarvis E.D., Kanai A., Kawaji H., Kawasawa Y., Kedzierski R.M., King B.L., Konagaya A., Kurochkin I.V., Lee Y., Lenhard B., Lyons P.A., Maglott D.R., Maltais L., Marchionni L., McKenzie L., Miki H., Nagashima T., Numata K., Okido T., Pavan W.J., Pertea G., Pesole G., Petrovsky N., Pillai R., Pontius J.U., Qi D., Ramachandran S., Ravasi T., Reed J.C., Reed D.J., Reid J., Ring B.Z., Ringwald M., Sandelin A., Schneider C., Semple C.A., Setou M., Shimada K., Sultana R., Takenaka Y., Taylor M.S., Teasdale R.D., Tomita M., Verardo R., Wagner L., Wahlestedt C., Wang Y., Watanabe Y., Wells C., Wilming L.G., Wynshaw-Boris A., Yanagisawa M., Yang I., Yang L., Yuan Z., Zavolan M., Zhu Y., Zimmer A., Carninci P., Hayatsu N., Hirozane-Kishikawa T., Konno H., Nakamura M., Sakazume N., Sato K., Shiraki T., Waki K., Kawai J., Aizawa K., Arakawa T., Fukuda S., Hara A., Hashizume W., Imotani K., Ishii Y., Itoh M., Kagawa I., Miyazaki A., Sakai K., Sasaki D., Shibata K., Shinagawa A., Yasunishi A., Yoshino M., Waterston R., Lander E.S., Rogers J., Birney E., Hayashizaki Y; FANTOM Consortium; RIKEN Genome Exploration Research Group Phase I & II Team, Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs, Nature, 420, 563 - 73 (2002)
- Genome Res., 13, 1265 - 1561 (2003)
- Shiraki T., Kondo S., Katayama S., Waki K., Kasukawa T., Kawaji H., Kodzius R., Watahiki A., Nakamura M., Arakawa T., Fukuda S., Sasaki D., Podhajska A., Harbers M., Kawai J., Carninci P. and Hayashizaki Y., Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage. Proc Natl Acad Sci U S A., 100, 15776-81 (2003)
- Ng P., Wei C.L., Sung W.K., Chiu K.P., Lipovich L., Ang C.C., Gupta S., Shahab A., Ridwan A., Wong C.H., Liu E.T. and Ruan Y., Gene identification signature (GIS) analysis for transcriptome characterization and genome annotation, Nat Methods, 2, 105 - 11 (2005)
Full-length cDNA Microarray
19K RIKEN microarray
To analyze the function of the sequenced cDNAs, we are performing expression profiling based on microarray experiments. We first established a set of clones suitable for microarray experiments, containing 19k of our cDNAs. A resource containing the expression profiles of 49 tissues from adults and embryos has been constructed. Expression data are analyzed and stored in our expression database, READ. These data are an exceptional resource for the functional characterization of cDNAs and gene cascades.
Construction of a high-throughput arrayer
To facilitate the printing of a large number of slides, a new arrayer has been constructed. It has two arms, each of which holds a pin head, and a large stage on which 96 microarrays can be prepared simultaneously. When a 16-pin head is used, 96 microarray slides containing 30,000 cDNAs each can be prepared in 100 hours. A 48-pin head that allows faster printing is also available. The maximum performance is expected to be 96 microarrays of 30k cDNAs in 16 hours.
- Miki R., Kadota K., Bono H., Mizuno Y., Tomaru Y., Carninci P., Itoh M., Shibata K., Kawai J., Konno H., Watanabe S., Sato K., Tokusumi Y., Kikuchi N., Ishii Y., Hamaguchi Y., Nishizuka I., Goto H., Nitanda H., Satomi S., Yoshiki A., Kusakabe M., DeRisi J.L., Eisen M.B., Iyer V.R., Brown P.O., Muramatsu M., Shimada H., Okazaki Y. and Hayashizaki Y., Delineating developmental and metabolic pathways in vivo by expression profiling using the RIKEN set of 18,816 full-length enriched mouse cDNA arrays, Proc. Natl. Acad. Sci. USA, 98, 2199-2204 (2001)
Protein-Protein Interaction Analysis System
To uncover the function of each gene with a systematic genome-wide approach, a protein-protein interaction (PPI) panel covering all genes is under development. PPIs are pivotal for understanding the network of cellular biological processes and for identifying potential targets for drug development. With older technologies, it would be impossible to establish a comprehensive PPI panel in mouse, because the estimated total number of mouse genes is far larger than those of budding yeast (~6,000) and C. elegans (~20,000).
To address this difficulty, we have developed a high-throughput PPI assay system that consists of PCR-mediated sample preparation and a modified mammalian two-hybrid method. In the pilot study, this high-throughput system examined more than 106 combinations per day. Together with our comprehensive mouse full-length cDNA clone bank, which covers a large number of genes, this system will allow the rapid discovery of many interactions, helping to understand the function of uncharacterized proteins and the molecular mechanisms of crucial biological processes. It will also help in preparing a draft of the entire set of PPIs in specific target cell types or tissues of the mouse.
RLGS (Restriction Landmark Genomic Scanning)
RLGS MAP Homepage (The Cancer Inst. and RIKEN)
- Kamiya M., Judson H., Okazaki Y., Kusakabe M., Muramatsu M., Takada S., Takagi N., Arima T., Wake N., Kamimura K., Satomura K., Hermann R., Bonthron D.T., Hayashizaki Y., The cell cycle control gene ZAC/PLAGL1 is imprinted - a strong candidate gene for transient neonatal diabetes, Hum. Mol. Genet., 9, 453-460 (2000)
- Akiyoshi S., Kanda H., Okazaki Y., Akama T., Nomura K., Hayashizaki Y. and Kitagawa T., A genetic linkage map of the MSM Japanese wild mouse strain with restriction landmark genomic scanning (RLGS), Mammal. Genome, 11, 356-359 (2000)
- Sugahara Y., Akiyoshi S., Okazaki Y., Hayashizaki Y. and Tanihata I., An automatic image analysis system for RLGS films, Mammal. Genome, 9, 643-651 (1998)
- Plass C., Shibata H., Kalcheva I, Mullins L., Kotelevtseva N., Mullins J., Kato R., Sasaki H., Hirotsune S., Okazaki Y., Held W.A., Hayashizaki Y. and Chapman V.M., Identification of Grf1 on mouse chromosome 9 as an imprinted gene by RLGS-M, Nature Genetics, 14, 106-109 (1996)
- Okazaki Y., Okuizumi H., Ohsumi T., Nomura O., Takada S., Kamiya M., Sasaki N., Matsuda Y., Nishimura M., Tagaya O., Muramatsu M. and Hayashizaki Y., A genetic linkage map of the syrian hamster and localization of cardiomyopathy locus on chromosome 9qa2.1-b1 using RLGS spot-mapping, Nature Genetics, 13, 87-90 (1996)
- Hayashizaki Y., Hirotsune S., Okazaki Y., Shibata H., Akasako A., Muramatsu M., Kawai J., Hirasawa T., Watanabe S., Shiroishi T., Moriwaki K., Taylor B. A., Matsuda Y., Elliott R. W., Manly K. F. and Chapman V.M., A genetic linkage map of the mouse using restriction landmark genomic scanning (RLGS), Genetics, 138, 1207-1238 (1994)
Transcriptional Sequencing (TS)
We developed a new sequencing system, called transcriptional sequencing (TS), based on RNA polymerase instead of the classical DNA polymerase. The advantages of this system are (1) the high processivity of the RNA polymerase, (2) the low quantity of template required, without concentration adjustment, and (3) the isothermal amplification, which uniformly incorporates 3'-dNTPs as well as fluorescent 3'-dNTP dye terminators as terminators.
- Sasaki N., Izawa M., Watahiki M., Ozawa K., Tanaka T., Yoneda Y., Matsuura S., Carninci P., Muramatsu M., Okazaki Y. and Hayashizaki Y., Transcriptional sequencing: A method for DNA sequencing using RNA polymerase, Proc. Natl. Acad. Sci. USA, 95, 3455-3460 (1998)
- Izawa M., Sasaki N., Watahiki M., Ohara E., Yoneda Y., Muramatsu M., Okazaki Y. and Hayashizaki Y., Recognition sites of 3'-OH group by T7 RNA polymerase and its application to transcriptional sequencing, J. Biol. Chem., 273, 14242-14246 (1998)