What is the correct flow of information in gene expression

Life exists due to the emergence of molecules that store and transmit genetic information. Most of them are common to all living beings, as are several of the mechanisms that allow the flow of information. Replication is one of these mechanisms: through it, cells obtain a semiconservative copy of all the genetic material during the S phase of the cell cycle. This process occurs only in cells involved in cell division. Other mechanisms are transcription, which consists of reading fragments of the genetic material to produce different types of RNA, and translation, in which ribosomes read RNA to synthesize proteins. Living beings store genetic information in the form of single- or double-stranded RNA or DNA. Therefore, the flow of genetic information between a cell and its progeny is guaranteed through replication, while the flow of genetic information inside the cell occurs by reading the information contained in genes through transcription and translation. These three mechanisms are strongly regulated and are activated or inactivated by signals that cells detect and interpret, which generates a response involving the thousands of molecules that make up the cell signaling pathways. This chapter focuses on gene expression and its regulation, including the transcription and translation processes. Once the concept of the gene and its structure are defined, the sequential stages of transcription and translation and the molecules involved will be described. Moreover, it must be highlighted that gene expression is regulated during the entire process and through diverse mechanisms. 
Some examples of these mechanisms are: remodeling of molecules which serve as a template such as in the case of DNA in transcription and messenger RNA (mRNA) in translation; activation of protein factors which participate in different stages and can also be targets on signaling pathways; maturation and transportation of end products so they can acquire their functional structures and reach the sites where they carry out their function (inside or outside the cells) through the recognition of signals inscribed in the products of these mechanisms. Finally, some epigenetic properties of the genome that are important for gene expression will be discussed. These include the position of the gene in the chromosome and nucleus, the degree of chromatin condensation, and the multifractal characteristics of the genome. The latter property has recently been associated with gene regulation and genomic stability.

The human genome has three billion base pairs, of which only about 1.5% is coding: the DNA used to synthesize proteins or different types of RNA such as ribosomal RNA (rRNA), transfer RNA (tRNA), and messenger RNA (mRNA). These base pairs are organized in DNA fragments that form structural reading units of genetic information called genes. The human genome contains approximately 20,000 genes and, even though the functions of every one of them are not known, they are recognized as DNA fragments with promoter and coding regions which can be transcribed. The promoter region is indispensable for transcription because it guides the positioning of the transcriptional machinery. The transcribed region contains sequences of coding nucleotides called exons, non-coding regions known as introns, and 5’UTR and 3’UTR regions with regulatory functions. The initiator element (Inr), the site where transcription begins, is found between the promoter and the transcribed region. The non-coding 5’UTR region is found between the initiator and the first exon, which is then followed by the first intron. Thereafter, exons and introns are interspersed and, after the last exon, there is a non-coding 3’UTR sequence containing the signal sequence for adding the polyadenine tail (poly A) to the 3’ end of the mRNA (Figure 1).

Promoter. This regulatory region has a variable length and contains both specific sequences and consensus sequences common to several promoters. Consensus sequences can keep genes in permanent activity, and specific sequences regulate gene expression in response to different signals. These sequences are called proximal elements when they are close to the transcription initiation site and distal elements when they are located at distances greater than 1 kb. Examples of proximal elements are the TATA, CAAT, SRE, CRE, AP1, and Sp1 boxes. The basic elements necessary for a promoter to be functional are the TATA box located about 25 bp upstream of the transcription start site (−25), the initiator (Inr) that includes the start site located at +1, and the downstream promoter element (DPE) located about 30 bp downstream of the Inr (+30). Distal elements also regulate gene expression and are located at variable distances. Some of them are the locus control region (LCR), which regulates gene families expressed only during embryonic development such as homeotic genes, and the insulator element, which is a sequence acting as a border between regions of transcriptionally active euchromatin and inactive heterochromatin. For instance, one insulator element that has been described in the β-globin genes separates odorant receptor genes on one side from the folate receptor genes on the other. The last distal elements to mention are the enhancer and the silencer, which promote or block transcription respectively. These act at long distances and in different positions and orientations in relation to the gene.

The sequences mentioned above are part of the DNA elements acting in cis and are generally recognized by their association with transcription factors acting in trans. These factors locally modify the chromatin structure to assemble the transcriptional complex if they are activators, or to block the approach of this complex to the transcription initiation site if they are repressors.

At this point, it is important to define transcription factors (TFs): proteins whose structure contains functional domains for interaction with DNA and/or other proteins and for transcription activation or repression. TFs recognize and bind to specific nucleotide sequences in the DNA. However, genes can have different nucleotide sequences in their promoters: a large percentage have some consensus sequences such as the TATA and CAAT boxes, but the elements that provide specificity to each gene are found in proximal and distal regions. Therefore, alterations in the structure or function of TFs modify the expression of the genes under their regulation; even single nucleotide polymorphisms (SNPs) in the sequences encoding TFs and in promoter elements have been associated with the development of diseases.

Gene expression is determined by reading gene information during the transcription and translation processes. The product of transcription is RNA (mRNA in the case of protein-coding genes) and the product of translation is a protein. Not all genes encode proteins, which means that some genes are transcribed but their RNA is not translated. Both processes are carried out in consecutive stages, which makes it difficult to establish the boundary between them. Moreover, each one involves multiple protein complexes in which subunits are recruited as the process advances or are released and replaced by others once they fulfill their function. Transcription, translation, and their regulatory mechanisms require a permanent recognition of molecular structures, interactions and associations between compounds, and the identification of signals that constitute true communication codes (Figure 2).

Some aspects of promoter gene regions and Autoimmune Diseases.

Transcription is the process in which a fragment of genetic information, the DNA template, is read in order to synthesize RNA molecules. The phases of transcription are pre-initiation, initiation, promoter clearance, elongation, and termination. All of them involve protein complexes whose components are assembled and disassembled as the process advances. Most of the mechanisms that regulate the flow of genetic information act in the period prior to the reading process, when signals and molecules must come together to assemble the pre-initiation complexes that will be modified to form the initiation complexes. Furthermore, different factors facilitate and review the process and keep the machinery active during elongation and, during termination, spatial structures or cis sequences are recognized to stop the reading machinery. There are three types of genes that encode RNAs or proteins. Type-I genes encode rRNA (28S, 18S, and 5.8S) and are transcribed by the RNAP I holoenzyme. Type-II genes encode mRNA, hnRNA, snRNA, miRNA, and telRNA and are transcribed by RNAP II. Finally, Type-III genes encode tRNA, snRNA, and 5S rRNA and are transcribed by the RNAP III holoenzyme. The three RNAP complexes consist of different subunits and are accompanied by general transcription factors specific to each complex. In all of these types of genes, the promoter is the axis where protein complexes arrive, assemble, and accompany RNAP during the beginning of transcription. RNAP provides the assembly site for the complexes in charge of the continuity of the process and of modifying the growing RNA molecule. Nevertheless, RNA modification is different for each type of RNA. For example, to obtain mature mRNA (the template used in protein synthesis), co-transcriptional modifications such as capping, splicing, cleavage, and poly(A) addition must be made.

The transcription of type II genes by RNAP II is the best understood so far. Even though some miRNAs can also be transcribed by RNAP III, they have promoters with structures that are different from those described above. Note that RNAP II is a holoenzyme made up of 60 polypeptides with an approximate mass of 3 million daltons (3 MDa). The polymerase itself is organized in 12 subunits called Rpb1-Rpb12, with a core that consists of Rpb1, Rpb2, Rpb3, Rpb6, and Rpb11. Moreover, the carboxyl terminal domain (CTD) of the Rpb1 subunit contains a sequence of 7 amino acids, YSPTSPS (Tyr1, Ser2, Pro3, Thr4, Ser5, Pro6, Ser7), which is repeated 26 times in yeast and 52 times in mammals such as mice and humans (Figure 3). However, the sequence can have subtle differences between repeated segments and ends in a 10 amino acid sequence. The CTD is an interaction site for multiple factors, including factors that promote chromatin remodeling.
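The heptad organization of the CTD can be made concrete with a short Python sketch. It builds an idealized CTD from perfect YSPTSPS repeats and lists where a given residue of the heptad (for example Ser2 or Ser5) falls along the domain. The function names and the assumption that all repeats are identical are illustrative only; as noted above, real CTDs contain repeats that diverge from the consensus.

```python
HEPTAD = "YSPTSPS"  # consensus repeat: Tyr1 Ser2 Pro3 Thr4 Ser5 Pro6 Ser7


def build_ctd(n_repeats):
    """Return an idealized CTD made of n_repeats perfect heptads (an assumption)."""
    return HEPTAD * n_repeats


def residue_positions(ctd, heptad_index):
    """Return 1-based CTD positions of the residue at heptad_index (1..7) in every repeat."""
    return [i + 1 for i in range(len(ctd)) if i % len(HEPTAD) == heptad_index - 1]


ctd = build_ctd(3)  # three repeats keep the printout short
print(ctd)
print("Ser2 positions:", residue_positions(ctd, 2))
print("Ser5 positions:", residue_positions(ctd, 5))
```

The Ser2 and Ser5 positions returned here are the residues whose phosphorylation state changes during the transcription cycle.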

Other functions of the CTD are to guarantee the processivity of the RNAP complex, avoid early termination, modify the new RNA co-transcriptionally, and promote elongation. The CTD’s binding site for different complexes is called a “docking” domain, and the large combination of complexes that bind to this domain constitutes the CTD code, which determines the recruitment and release of proteins at the appropriate moment during transcription. Furthermore, during a transcription cycle, the CTD structure is modified by phosphorylation and dephosphorylation on both Ser2 and Ser5. It must remain hypophosphorylated during initiation and hyperphosphorylated during promoter clearance and elongation, and it must be dephosphorylated in order to be recycled.

The CTD domain can be phosphorylated at different transcription stages by three main kinases: Cdk7, Cdk8, and Cdk9 and is dephosphorylated by phosphatases. The Cdk7/Kin28 complex is a TFIIH subunit that phosphorylates the CTD on Ser5, which is essential to recruit Set1 and enzymes participating in cap formation and promoter clearance regulation. Cdk8 is associated with the Srb/mediator, which is a complex that connects enhancers and silencers with RNAP II before elongation. Additionally, Cdk9 and cyclin T are subunits of the positive transcription elongation factor b (P-TEFb). Cyclin T recognizes a sequence rich in histidine on CTD. On one hand, phosphorylation on Ser2 by Cdk7, Cdk8, and Cdk9 favors the recruitment of factors facilitating continuity of transcription and splicing. On the other, dephosphorylation of the CTD domain is mediated by Ser2 specific phosphatase (FCP1) and Ser5-selective phosphatase (SCP1) with the participation of TFIIH. FCP1 is a factor that is also involved in the modulation of transcriptional pauses and RNA processing. It interacts with the DRB-sensitivity-inducing factor (DSIF) and RNA polymerase-associated factor 1 (PAF1) and is inhibited by proteins from the capping complex (Figure 3).

The first phase of transcription is pre-initiation, which consists of formation of the pre-initiation complex (PIC) on the promoter of the gene. PIC is composed of general and specific transcription factors, coactivators, and the RNAP complex. Its assembly is stimulated when an activator binds to one of the promoter sequences thus generating a spatial structure recognizable to coactivators located in the same region. This region also recruits ATP-dependent complexes to produce chromatin relaxation.

Therefore, these changes provide the space necessary for the arrival of basal factors including TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH, and the RNAP II complex. PIC formation is nucleated by the TATA binding protein (TBP), which is a TFIID subunit. In the absence of the TATA box, another TFIID subunit can bind to another element of the promoter and support PIC formation. Afterwards, TFIIB binds to and surrounds the DNA-TBP complex, and its N-terminal domain extends to the transcription start site and provides support for the binding of RNAP II. Moreover, TFIIF and RNAP II bind to the promoter guided by TFIIB and, at the same time, TFIIE joins the complex and acts as a signal to recruit TFIIH. At this point, PIC is assembled on the promoter core (Figure 4).

TFIID is composed of TBP and 14 TBP-associated factors (TAFs). Some TAF subunits have acetyltransferase activity or are associated with chromatin remodeling complexes such as Spt-Ada-Gcn5-Acetyltransferase (SAGA). TAF1 and TAF2 bind to the Inr element, while TAF6 and TAF9 bind to DPE. Additionally, TFIIH consists of 10 subunits and has helicase activity in its XPB subunit (xeroderma pigmentosum group B protein, also known as ERCC3). XPB is a 3’-5’ DNA-dependent helicase required for transcription and for DNA repair through nucleotide excision repair (NER), and it belongs to Superfamily II of ATP-dependent helicases. TFIIH thus facilitates promoter melting during initiation and regulates the transition from initiation to elongation through Cdk7-catalysed phosphorylation of the Ser5 residue of the RNAP II CTD. The presence of mediator complexes during PIC formation has been described in some promoters. The mediator complex, which is composed of at least 24 subunits and has a mass greater than 1 MDa, acts as a general transcription factor. It integrates regulatory signals from enhancers and silencers, associates with the hypophosphorylated RNAP II complex, and stimulates the TFIIH kinase activity.

Initiation occurs after PIC formation, when the reading machinery is assembled and the two initiating nucleoside triphosphates (NTPs) are linked according to the DNA sequence, which leads to the formation of the first phosphodiester bond. When RNAP II is assembled on the promoter, the TFIIH Cdk7 subunit phosphorylates Ser5 on the CTD and induces the activation of the XPB helicase function. Activation of XPB produces the single DNA strand that stimulates RNAP II polymerase activity, which marks the beginning of transcription. Afterwards, the CTD is phosphorylated on Ser2 by the Cdk9 subunit of P-TEFb, which is associated with cyclin T1, T2a, or T2b or with cyclin K. Phosphorylation of the CTD by P-TEFb leads to promoter clearance.

After 20 to 40 ribonucleotides have been joined, the CTD is dephosphorylated on Ser5 by the Rtr1 and Ssu72 phosphatases. In contrast, it is phosphorylated on Ser2 by Cdk9. These changes release the general transcription factors from the RNAP II complex; the factors remain on the promoter and serve as a scaffold for the formation of another RNAP II initiation complex (IC) (Figure 5).

It has been noted that some Ser5 must remain phosphorylated along with Ser2 to allow promoter clearance, which consists of the movement of the transcriptional machinery away from the promoter. The stability of the complex increases when TFIIB is released and TFIIF remains in it thus producing a pause or termination of the process.

Elongation is a very important step for the integration and coordination of multiple events, such as RNA synthesis (which guarantees that the reading process continues), inhibition of premature transcription termination, and prevention of errors during the process. Other important events are related to RNA maturation. The binding of other proteins to the RNAP II complex generates the transcription elongation complex (TEC) (Figure 6), which is composed of two protein families classified as active proteins (TFIIF, Elongins, ELL, DSIF, NELF, CSB, FCP1, TFIIS, Spt6, 19S proteasome) or passive proteins (P-TEFb, Ssu72, SWI/SNF, Isw1p, CHD1, FACT, Set1, Set2, Paf, THO, TREX, and Iws1/Spn1) on the basis of their action on RNAP II catalytic activity. Over 23 elongation complexes are in charge of preventing RNAP II from pausing or stopping after the transcription process has started. The activity of TFIIF, an active protein, is regulated through phosphorylation by TFIID and TFIIH subunits. TFIIF lessens pausing and stimulates the elongation rate of RNAP II. At the same time, TFIIH contributes to stabilizing the RNAP II complex, releasing TEC from the promoter, and keeping TEC close to the promoter. Other active proteins belong to the Elongin (SIII) complex, which consists of subunits A, B, and C, prevents RNAP II from being paused or stopped, and promotes its advance along the DNA. Subunits B and C also participate in a ubiquitination-dependent mechanism and are able to stop the transcription process when DNA mutations are detected. The eleven-nineteen lysine-rich leukemia (ELL) protein belongs to a three-member family composed of ELL1, ELL2, and ELL3 and acts as an antitermination factor during elongation. Another active protein is DSIF, a heterodimeric complex with Spt4 and Spt5 subunits that binds directly to RNAP II and pauses transcription in the presence of 5,6-dichloro-1-β-D-ribofuranosylbenzimidazole (DRB). 
DSIF also acts along with NELF to prevent premature termination and interacts with several factors such as TFIIF, TFIIS, CSB, and chromatin remodeling factors including Spt6, FACT, CHD1, and the PAF complex. The Spt5 subunit also interacts with mRNA stability and maturation factors. It also favors the continuation of elongation through methylation of arginine residues by PRMT1 and PRMT5 and through phosphorylation observed in vitro. As a result, DSIF stimulates elongation, suppresses early termination of the transcription process, and stimulates capping. As mentioned above, NELF is an active protein complex that consists of NELF-A, NELF-B, NELF-C, NELF-D, and NELF-E. It binds to the DSIF/RNAP II complex and halts transcription to allow the assembly of factors participating in 5’ capping, maturation, and transport of mRNA. P-TEFb, in turn, accompanies RNAP II during its displacement along the gene and lessens the pause mediated by NELF through phosphorylation of Ser2 on the CTD and of DSIF (Spt5), thus allowing dissociation of the NELF complex and enabling the elongation process to continue.

The protein CSB is a DNA dependent ATPase which binds directly to the RNAP II complex and modulates TFIIS activity. It also participates in transcription-coupled repair (TCR) and nucleotide excision repair (NER) systems. Furthermore, it has been found that the NER system is defective in patients with premature aging syndrome or Cockayne Syndrome (CS) in whom CSB mutations have been identified. Finally, the TFIIS complex restores the transcription process after a pause, improves the efficiency of the process, and increases the rate of synthesis after it has been reduced during the pause. It is physically associated with Spt5 and contributes to the “proofreading” activity of the RNAP II complex.

Gene transcription elongation step regarding autoimmune phenotype.

As elongation progresses, protein complexes approach the primary transcript co-transcriptionally to carry out its maturation, give it a functional structure, and regulate its permanence inside the cell and its export from the nucleus. Modifications vary among types of RNA; for mRNA they include capping, splicing, 3’ cleavage, and the addition of adenine nucleotides to form the poly A tail. Capping comes first, and 3’ cleavage precedes the addition of the poly A tail. The CTD is the platform on which most of these events are coordinated, and its phosphorylation status determines the structural conformations that promote association or dissociation of complexes as needed.

In yeasts, a model with 26 repetitions of the CTD heptapeptide revealed the formation of a left-handed beta spiral structure which is stabilized by phosphorylation on Ser2 and is not compatible with phosphorylation on Ser5. This structure favors the assembly of pre-mRNA modifying complexes.

Pre-mRNA is the raw material for the maturation process, in which it is transformed into a functional mRNA for translation. The capping reaction occurs once the newly synthesized pre-mRNA has a length of approximately 20–30 bases. At that point, RNAP II pauses, and three proteins approach the 5’ end of the pre-mRNA. The first protein is the RNA 5’-triphosphatase, which converts the 5’ triphosphate of the first nucleotide into a diphosphate. This is followed by the addition of one GMP, catalyzed by guanylyltransferase. Afterwards, a methyltransferase adds a methyl group to the N7 of the guanine, while the first and sometimes second nucleotides of the primary transcript are methylated on their 2’-hydroxyl groups to complete the cap structure. The capping process depends on the presence of Ser5-phosphorylated CTD and on the interaction between capping enzymes and DSIF. There is a checkpoint that verifies the fidelity of the capping mechanism and contributes to RNAP II stability. The capped end of the mRNA is protected from exonucleases and is recognized by specific proteins belonging to the splicing, transport, and translational machineries. Once the cap structure has been synthesized, the FCP1 enzyme helps to remove the enzymes participating in its assembly.

Cleavage occurs near the 3’ end of the pre-mRNA, at a CA dinucleotide located between the AAUAAA signal and a downstream GU-rich sequence. The pre-mRNA is released from TEC, and a variable-length fragment called the poly A tail is then added. Meanwhile, RNAP II continues transcribing for several kilobases beyond the cleavage site, thus forming an RNA strand that will be degraded. Poly A polymerase builds the poly A tail (about 200–300 nucleotides long) on the 3’ end of the mRNA with the assistance of the poly A binding protein (PAB). The process is guided by the recognition of an AAUAAA signal present in the primary transcript, approximately 10–30 nucleotides upstream of the CA dinucleotide where the cleavage takes place. Cleavage factors CFI and CFII cut the 3’ end of the pre-mRNA, which requires the binding of the cleavage and polyadenylation specificity factor (CPSF) to the AAUAAA sequence and the binding of the cleavage stimulation factor (CstF) to the GU-rich sequence. Note that CPSF associates with TFIID and is recruited during PIC formation, which shows a connection between the transcription initiation and termination processes.
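The sequence logic of 3’ end formation can be sketched in a few lines of Python. The example below searches a pre-mRNA string for the AAUAAA signal, looks for a CA dinucleotide 10–30 nucleotides downstream of it, cuts after that CA, and appends a run of adenines. The function names, the way the gap is measured (from the end of the signal to the first base of CA), and the toy sequence are assumptions made for illustration; real site selection also depends on the GU-rich element and on the protein factors described above.

```python
import re

POLYA_SIGNAL = "AAUAAA"


def find_polya_sites(pre_mrna, min_gap=10, max_gap=30):
    """Return (signal_start, cleavage_index) pairs, both 0-based.

    cleavage_index points just after the first CA dinucleotide whose first base
    lies min_gap..max_gap nucleotides after the end of an AAUAAA signal.
    """
    sites = []
    # The lookahead lets overlapping occurrences of the signal be reported too.
    for match in re.finditer("(?=" + POLYA_SIGNAL + ")", pre_mrna):
        signal_end = match.start() + len(POLYA_SIGNAL)
        ca_start = pre_mrna.find("CA", signal_end + min_gap, signal_end + max_gap + 2)
        if ca_start != -1:
            sites.append((match.start(), ca_start + 2))
    return sites


def cleave_and_polyadenylate(pre_mrna, tail_length=200):
    """Cut at the first usable site and append a poly A tail of tail_length."""
    sites = find_polya_sites(pre_mrna)
    if not sites:
        return pre_mrna  # no usable signal: leave the transcript untouched
    _, cleavage_index = sites[0]
    return pre_mrna[:cleavage_index] + "A" * tail_length


# Toy transcript: signal, 15-nt spacer, CA cleavage site, GU-rich region.
pre_mrna = "GG" + "AAUAAA" + "U" * 15 + "CA" + "GUGUGUGU"
print(find_polya_sites(pre_mrna))
print(cleave_and_polyadenylate(pre_mrna, tail_length=5))
```

The GU-rich region downstream of the cut is discarded, which mirrors the degradation of the RNA that RNAP II keeps synthesizing past the cleavage site.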

Splicing consists of the selective elimination of fragments (primarily introns) from pre-mRNA, which is an essential step in the expression of multiple genes. The cell machinery that splices pre-mRNAs uses information at the splice junctions to determine where to cut and where to rejoin the mRNA. The elimination of exons, through which different mRNAs are obtained and translated into protein isoforms, is less common and is called alternative splicing. Exons are relatively short sequences (typically 50–250 base pairs in length) separated by larger sequences called introns (typically hundreds to thousands of base pairs or more in length), which account for more than 90% of the primary transcript. The complex responsible for this process is called the spliceosome, and it consists of more than 100 core proteins and five small nuclear RNAs (snRNAs): U1, U2, U4, U5, and U6. The spliceosome is recruited in the presence of hyperphosphorylated CTD and requires additional regulatory proteins to perform its functions. Some of these regulatory proteins have ATP-dependent helicase activity and others belong to the serine/arginine-rich (SR) family of proteins, which bind to the exonic splicing enhancer (ESE) element in the exon next to the 3’ splice site. Moreover, the core splicing signals are three sites in every intron: the 5’ splice site (5’ss: 5’ A/C AG-GU A/G AGU 3’), the 3’ splice site with the polypyrimidine (Y) sequence (3’ss: 5’ YYYYYYYYYYYYN C/U AG-G G/U 3’), and the branch point sequence (BPS).

The process starts when U1 binds to the 5’ss (U1 bears a complementary sequence to 5’ss) and is followed by the binding of U6. At the same time, U2AF joins the 3’ss sequence and SF1/mBBP is recruited in BPS. Later, U2 is associated with BPS- SF1/mBBP complex (Figure 7). After this, U2AF recruits U2 to the intron’s central region, ESE, and it, in turn, associates with SR and constitutes the U4, U6, and U5 binding platform. At this point, U1 and U4 abandon the complex, followed by the formation of a loop and activation of U6 ribozyme function which cuts the 5’ and 3’ ends of the intron, thus producing the release of the intron in order to bind the exons. During the intron’s cutting process, U5 keeps the exons together. Finally, the exons are ligated and the spliceosome is disassembled. Splicing is regulated by cis-elements (ESE, ESS, ISS, and ISE) and trans-acting factors (SR proteins, hnRNP, and unknown factors), and occurs with very high accuracy.
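At the sequence level, the outcome of splicing can be illustrated with a minimal Python sketch. It removes introns whose coordinates are supplied by the caller, checks only that each one starts with GU and ends with AG as in the splice site signals described above, and joins the remaining exons. Merging two introns into one interval skips the exon between them, which gives a simple picture of alternative splicing. The function name, the coordinate convention, and the toy sequences are hypothetical; the sketch does not model the branch point, the snRNAs, or any regulatory element.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) 0-based half-open intervals.

    Each removed interval must begin with GU and end with AG; anything more
    subtle (branch point, polypyrimidine tract) is ignored in this sketch.
    """
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"interval {start}-{end} does not start with GU and end with AG")
        exons.append(pre_mrna[cursor:start])  # keep the exon before this intron
        cursor = end
    exons.append(pre_mrna[cursor:])  # keep the last exon
    return "".join(exons)


# Toy pre-mRNA with three exons and two introns.
exon1, intron1, exon2, intron2, exon3 = "AUGGCC", "GUAAGUCCUUUCAG", "UUU", "GUGAGUUUUCCAG", "UAA"
pre_mrna = exon1 + intron1 + exon2 + intron2 + exon3
i1 = (len(exon1), len(exon1) + len(intron1))
i2 = (i1[1] + len(exon2), i1[1] + len(exon2) + len(intron2))

constitutive = splice(pre_mrna, [i1, i2])      # all three exons joined
skipped = splice(pre_mrna, [(i1[0], i2[1])])   # exon 2 removed with its flanking introns
print(constitutive)
print(skipped)
```

The two calls produce different mRNAs from the same pre-mRNA, which is the sense in which alternative splicing yields protein isoforms.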

Termination leads to RNA release and disassembly of the reading machinery. RNAP I and RNAP II have a common termination factor: TTF2 which is independent from CTD and has a mechanism which is not well understood. TTF2 is also associated with RNAP II surveillance activity and nuclear export.

Once the mRNA is synthesized and checked, the exosome complex, which has 3’-5’ exoribonuclease activity, eliminates mRNAs carrying alterations. An mRNA export complex called transcription/export (TREX) participates in mRNA transport to the cytoplasm and is also associated with elongation.

Alternative splicing and its relationship with Autoimmune Disease.

Translation is the process in which the information contained in the genes is transformed into proteins. The information expressed in the language of nucleotides (nucleic acids) is translated into the language of amino acids (proteins). In eukaryotic cells, translation takes place in the cytoplasm, which implies an additional regulation point in comparison to prokaryotic cells because the molecules used in protein synthesis, such as mRNA, rRNA, and tRNA, must be exported from the nucleus to the cytoplasm. The central organelle of translation is the ribosome, which has enzymatic properties that guarantee a balance between precision and rate of protein synthesis. Ribosomes have two main functions: to facilitate the correct interpretation of the genetic code and to form the peptide bonds. The ribosome is the site where the information contained in the mRNA is read and interpreted according to the rule established by the genetic code, in which three ribonucleotides are equivalent to one amino acid. Inside the ribosome, the amino acids provided by tRNAs are assembled and the peptide bonds are formed. All the elements required during translation must be present in the cytoplasm, including tRNAs carrying amino acids, ribosomal subunits, the mRNA template, and translation factors. Prior to ribosome assembly, the pre-initiation complex (PIC) must form upstream from the start codon on the mRNA’s 5’UTR region (Figure 8); this complex is in charge of finding the AUG initiation codon, which determines the open reading frame (ORF). Afterwards, the larger 60S ribosomal subunit is added to the structure, thus forming the 80S ribosome and marking the beginning of translation. During translation, the ribosome moves along the mRNA while the protein is synthesized until it identifies the stop codon, which produces the disassembly of the ribosomal subunits and the release of the protein.
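The decoding rule described above, three ribonucleotides per amino acid read from the AUG start codon until a stop codon, can be written as a short Python sketch. It uses the standard genetic code with one-letter amino acid symbols and simply starts at the first AUG in the string. The function name and the example sequence are illustrative assumptions, and the sketch ignores everything the ribosome and the initiation factors actually do to select the start codon.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks the stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):  # read one codon (3 nt) at a time
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: release the protein
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGGCCUUUUAAGG"))  # codons read: AUG GCC UUU UAA
```

The two leading nucleotides in the example play the role of a 5’UTR: they are skipped because the reading frame is set by the AUG.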

The ribosome has three interaction sites: the aminoacyl site (A site), where the aminoacyl-tRNA enters and the codon-anticodon pairing is recognized; the peptidyl site (P site), where the peptide bond is catalyzed; and the exit site (E site), where the deacylated tRNA, which no longer carries an amino acid, sits before it leaves the ribosome.

Prokaryotic cells have a specific sequence 3 to 9 nucleotides in length (UAAGGAGG) in the 5’UTR of the mRNA, called the Shine–Dalgarno sequence, which is complementary to the 3’ end of the 16S rRNA of the small ribosomal subunit. Their interaction results in the formation of a duplex that binds the small ribosomal subunit to the mRNA. Thereafter, the duplex is destabilized and the subunit is displaced until it finds the start codon. A similar sequence has not been described in eukaryotes. Rather, eukaryotic initiation factors (eIFs) have been found to facilitate the binding of the small ribosomal subunit (40S), and the pre-initiation complex slides to the start codon. However, it has been reported that some mRNAs carry internal sequences called internal ribosome entry sites (IRESs) which are in charge of recruiting PIC to the start codon, thus starting a cap-independent translation. Some of them correspond to viral mRNA. In eukaryotes, PIC sediments at 48S. It is initially located on the capped 5’UTR region of the mRNA and moves along the mRNA through a scanning mechanism until it reaches the start codon (AUG). The assembly of this complex is guided by the eIF4F, eIF4A, and eIF4B factors present in the cap-binding complex, while the poly A binding protein (PABP) guides the approach of the 3’ end to the same complex. This leads to the formation of a loop and the closure of the 5’ and 3’ ends. PABP interacts with eIF4G and promotes mRNA circularization to form a “closed loop” which facilitates the reinitiation of translation by the ribosomal subunits released during the termination of a cycle (Figure 8a). Moreover, the eIF4F factor is comprised of eIF4A (a helicase, the activity of which is enhanced by eIF4B), eIF4E (the cap-binding protein), and eIF4G (a scaffold for eIF4E and eIF4A that also binds eIF3). The factors eIF4A, eIF4B, and eIF4F relax the region proximal to the cap and promote the binding of a 43S complex in order to form PIC 48S. 
Prior to this, in order to form 43S complexes, the ternary complexes (TCs) and the multifactor complexes (MFCs) are organized. The initiator ternary complex (TCi), which contains the methionyl-tRNA (methionine is the first residue in all eukaryotic proteins), forms through the association between eIF2 and GTP-Met-tRNAi, which is always the first tRNA to be located within the ribosome and carries the initiator methionine (Figure 8b). Other GTP-aminoacyl-tRNAs are bound to the eukaryotic elongation factor eEF1A to form other TCs. Subsequently, eIF1, eIF3, and eIF5 bind to TC in order to form the MFC (Figure 8c). Meanwhile, eIF5 binds to the 40S ribosomal subunit and recruits it to the MFC. As a result, the 43S complex is formed (Figure 8d). Finally, the 43S complex is recruited to the 5’UTR next to the cap, thus completing PIC 48S (Figure 8e). Afterwards, PIC 48S moves along the mRNA in the 5’ to 3’ direction until it finds the start codon, which in eukaryotes is flanked by the GCC(A/G)CCAUGG sequence (Figure 8f).
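The role of the flanking sequence in start codon selection can be illustrated with a small Python sketch based on a regular expression. It scans an mRNA string in the 5’ to 3’ direction and returns the position of the first AUG embedded in the GCC(A/G)CCAUGG context, ignoring AUG triplets that lack it. The function name and the example sequence are assumptions for illustration, and requiring an exact match to the consensus is a deliberate simplification, since start codons in real mRNAs often deviate from it.

```python
import re

# Consensus context from the text: GCC(A/G)CC AUG G, with the AUG captured as group 1.
KOZAK_CONTEXT = re.compile(r"GCC[AG]CC(AUG)G")


def find_start_in_context(mrna):
    """Return the 0-based index of the first AUG in the consensus context, or None."""
    match = KOZAK_CONTEXT.search(mrna)
    return match.start(1) if match else None


mrna = "GGGAUGCCGCCACCAUGGCU"
print("first AUG anywhere:   ", mrna.find("AUG"))
print("first AUG in context: ", find_start_in_context(mrna))
```

In this toy sequence the first AUG lies outside the consensus context, so the scan passes over it and selects the downstream one.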

A perfect match between the start codon and the anticodon of Met-tRNAi triggers hydrolysis of the GTP in the eIF2-GTP-Met-tRNAi complex and, at the same time, eIF5B mediates the joining of the 60S ribosomal subunit. This process leads to the assembly of the functional 80S ribosome, or initiation complex (IC), and the release of the eIF2-GDP+Pi, eIF1, eIF3, and eIF5 factors (Figure 8g). Some mRNAs have a secondary structure in the 5’UTR region and require ATP-dependent helicase activity to improve the binding of the 43S complex. During the assembly of ribosomes, the Met-tRNAi remains on the P site and the A site is available to receive the aminoacyl-tRNA which carries the next amino acid. In addition, the eIF1A factor binds to the small subunit at the beginning and participates in the recruitment of the TC. It also accompanies the 48S complex during scanning and temporarily occupies the A site in the ribosome.

Once the ribosome is assembled, a second aminoacyl-tRNA, delivered as a GTP-bound ternary complex, reaches the A site and the peptide bond between the first two amino acids is formed, a reaction catalyzed by peptidyl transferase. The first tRNA, now without its amino acid, is located on the E site in order to exit the ribosome during the next movement. Elongation thus consists of synchronized movements of the ribosome and tRNAs, through which tRNAs are shuttled through the A, P, and E sites until they exit the ribosome. The ribosome selects the amino acid that must enter from a pool of 40 different aminoacyl-tRNAs, and the correct codon-anticodon pairing induces changes in the ribosomal structure which contribute to its subsequent displacement on the mRNA. TCs are accompanied by eEF1A and their entry into the ribosome is facilitated by the ribosome GTPase activity. GTP hydrolysis causes a dramatic change in eEF1A structure which decreases its affinity for aminoacyl-tRNA and leads to its dissociation, thus allowing the accommodation of the aminoacyl-tRNA on the A site and the synthesis of the peptide bond. The eEF2 factor catalyzes a rapid and coordinated movement which leads to the translocation of the tRNA from the A site to the P site. Several antibiotics such as aminoglycosides, macrolides, and tetracyclines target the prokaryotic ribosome and alter the elongation rate and accuracy.

Post-transcriptional regulation in Autoimmune Diseases.

Termination takes place when the stop codon is recognized by the ribosome. This recognition modifies the ribosome structure, which halts translation and induces the release of the peptide and the disassembly of the ribosome’s components and elongation factors so that another translation cycle can start. The eRF1 and eRF3 factors participate in the termination process, and it is not known whether GTP hydrolysis is required at this point. In addition, evidence suggests that the interaction between PABP and the eRF1 and eRF3 factors stimulates the activity of the IC and the joining of the 60S subunit.
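
The sequence of initiation at AUG, codon-by-codon elongation, and termination at a stop codon can be summarized in a minimal Python sketch. It is an illustration only: the function name `translate` is hypothetical, the standard genetic code is assumed, and the input is assumed to contain only the bases A, C, G, and U.

```python
from itertools import product

# Standard genetic code, built with the bases in U, C, A, G order ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon or the end of the mRNA."""
    mrna = mrna.upper()
    start = mrna.find("AUG")          # initiation: first start codon
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):   # elongation: one codon per step
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":            # termination: stop codon recognized
            break
        peptide.append(residue)
    return "".join(peptide)


print(translate("GCCACCAUGGCUUAAUUU"))  # prints "MA"
```

The sketch treats the ribosome as a simple reading frame and leaves out the factors (eIFs, eEFs, eRFs) that drive each stage in the cell.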

Genes can have two states depending on whether their information is being read or not. Moreover, a gene is said to be active when it is transcribed and inactive when it is not; hence, the product of the transcription of some genes can be translated until the protein sequence is complete. Furthermore, it was observed that some genes are continually expressed while there are others with spatiotemporally regulated expression. This leads us to think about the presence of gene expression regulation systems. The first studies related to gene expression regulation were focused on the mechanisms of transcriptional activation and positive control, whereas transcriptional repression and negative control was less studied. Thus, the promoter was the center of several studies that concentrated on cis proximal and distal sequences and also on transcription factors which recognize promoters and have a transacting action. However, in recent years, studies have shown that transcriptional repression mediated by chromatin structure and repressor proteins is a common regulatory mechanism in mammals. This mechanism has generated new approaches to understanding gene activation as a relief of repression by the nucleosomal structure of the chromatin.

These mechanisms are at a different level from the gene itself. They are called epigenetic regulation and include chromatin remodeling through changes in DNA and histones as well as regulation by miRNAs and asRNA. Furthermore, it was observed that the number of different proteins far exceeds the number of genes described. All these facts suggest that there is a very efficient regulation system in gene expression within cells and also strategies to generate a diversity of proteins. Regarding the variety of proteins, alternative splicing has been postulated as the mechanism generating the majority of mRNA diversity from one gene. However, other mechanisms include alternative transcription initiation. For example, the microphthalmia-associated transcription factor gene (MITF) has several transcription initiation sites located between the TATA box and the first exon, thus generating different mRNA isoforms which are translated. Other mechanisms are alternative polyadenylation, gene fusion, and trans-splicing. These processes are not common in higher eukaryotes. Hence, gene expression regulation encompasses all the previously mentioned events, which are grouped into the following topics: events associated with RNA and protein molecules, epigenetic mechanisms, and signal transduction pathways. They can act at different moments during transcription or translation depending on how the process is analyzed.

The abundance, location, and activity of RNA and proteins must be regulated for the cell to function well. Abundance is regulated through a balance between synthesis and degradation rates and the mean lifetime of molecules. Location depends on the efficiency and reliability of the transportation systems, which work mainly by recognizing signals present in the molecules that need to be transported. Activity is regulated by the acquisition of the functional structure through protein folding, modifications after the synthesis process, and the interaction and association with other compounds.
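
The balance between synthesis and degradation can be made concrete with a minimal kinetic sketch. It assumes, purely for illustration, a constant synthesis rate $k_s$ and first-order degradation with rate constant $k_d$ for a molecule of abundance $N$; these symbols and the model itself are assumptions and are not taken from the text.

```latex
\frac{dN}{dt} = k_s - k_d N,
\qquad
N_{ss} = \frac{k_s}{k_d},
\qquad
t_{1/2} = \frac{\ln 2}{k_d}
```

Under these assumptions, doubling the synthesis rate or doubling the half-life each doubles the steady-state abundance $N_{ss}$.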

The RNA synthesis rate depends on what the cell needs. The highest rate belongs to rRNA, which is synthesized in the nucleolus, a region of the nucleus enriched with proteins which allow the transcriptional machinery to be assembled. The rate of synthesis is regulated by the signals present in the promoter, the availability of transcription factors, the epigenetic program, the chromatin structure, and the position of the gene in the nucleus and the chromosome. In addition, the set of proximal (TATA, SRE, CRE, etc.) or distal (enhancer, insulator, silencer, LCR) cis elements is specific for each gene. These elements can be recognized by transcription factors. Transcription factors are proteins which bind to the promoter prior to RNAP II. They bind to the promoter by recognizing specific elements and are classified into general and specific factors. General transcription factors are those present in almost all types of cells, while the expression of the specific factors is restricted in both time and cell type.

Transcription factors have at least three functional domains in their structure: one for DNA binding, one for protein-protein interaction, and one for transcription activation (TAD) or repression (TRD). All of them can be targets of signal transduction pathways which modify their structure through post-translational processes such as phosphorylation and dephosphorylation. The DNA-binding domains have three-dimensional structures such as the zinc finger, the leucine zipper, the helix-turn-helix (HTH), and the helix-loop-helix (HLH) structures. Their association is weak and occurs through 20 contact points by the recognition of short and specific sequences with high DNA-protein structural complementarity, which provides stability and specificity. The protein-protein interaction domains, in turn, are important for the formation of dimers and complexes. An example of this is the SRF-SRF complex. This dimer binds to the serum response element (SRE) and recruits other proteins in order to form complexes of higher molecular weight that consist of subunits (PIC, IC, and RNAP II). Activation domains have regions rich in glutamine and proline, whereas repression domains are rich in lysine. The transcription factors regulate the rate of transcription initiation by facilitating the entry of RNAP complexes, which may stay in place in order to keep the promoters permanently open and sustain a higher transcription rate. This is the case for rRNA genes.

Transcription Factors related to Autoimmune phenotype.

During translation, microRNAs can bind to the 3’UTR and block the process when the ‘‘closed loop’’ is formed via PABP. However, this structure also facilitates PIC and IC assembly and the reading of the same mRNA strand by several ribosomes at the same time (polyribosomes). Additionally, the poly A tail blocks the RNA helicase activity of eIF5B, which stimulates the binding of the tail to the ribosome’s major subunit and the release of the IC ribosome. The presence of IRES sequences on the mRNA of certain viruses such as Hepatitis C virus (HCV) promotes the assembly of ribosomes and the beginning of 5’cap-independent translation as well as increasing the efficiency of the process.

Maturation processes of primary transcripts are different for each type of RNA. Even though some of the mRNA modification has been mentioned, we will emphasize splicing regulation and the modifications of other RNA types. Alternative splicing of pre-mRNA is a major contributor to proteomic diversity and regulation of gene expression. Diversity is acquired by generating mRNA isoforms containing different combinations of exons which will be translated into multiple protein isoforms. Splicing is tightly regulated in different tissues and developmental stages, and its disruption can lead to human diseases. Several studies are focused on determining a set of rules or ‘‘code’’ to predict the splicing pattern for any primary transcript, and they have already shown a larger network of pathways to regulate the splicing process. For example, a trans-splicing event between distal genes has been described, and it results from the fusion between primary transcripts leading to the combination of exons from different genes.
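
How combinations of exons generate several mRNA isoforms from one gene can be sketched in a few lines of Python. This is a deliberately simplified illustration: the function name and exon labels are hypothetical, only cassette exons (exons that are either included or skipped) are modeled, and exon order is assumed to be preserved.

```python
from itertools import product


def enumerate_isoforms(exons, cassette):
    """List every exon combination obtained by keeping or skipping each cassette exon.

    exons: exon names in their order along the primary transcript.
    cassette: the subset of exon names that may be skipped.
    """
    # Constitutive exons are always kept; cassette exons are kept or skipped.
    choices = [(True, False) if exon in cassette else (True,) for exon in exons]
    return [
        [exon for exon, keep in zip(exons, mask) if keep]
        for mask in product(*choices)
    ]


for isoform in enumerate_isoforms(["E1", "E2", "E3", "E4"], {"E2", "E3"}):
    print("-".join(isoform))
```

With n independent cassette exons this toy model yields 2^n isoforms; in cells, the regulatory ‘‘code’’ mentioned above restricts which of these combinations are actually produced.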

Nitrogenous bases on tRNA, in turn, undergo covalent modifications, and the molecule forms loops and double-stranded fragments, acquires a cloverleaf structure, and receives an amino acid at its 3’ end. In contrast, the rRNA primary transcript undergoes several cleavages to yield the 18S, 5.8S, and 28S rRNAs, while the 5S rRNA is transcribed separately. The 5S, 5.8S, and 28S rRNAs are then joined by 49 ribosomal proteins to form the 60S subunit, and the 18S rRNA binds 33 ribosomal proteins to form the 40S subunit. Finally, proteins, which are the final product of translation, are covalently modified by the transfer of phosphate, methyl, acetyl, glycosyl, and ubiquitin groups, etc., to their structure. Several enzymes participate in this dynamic and reversible process. For example, kinases transfer phosphate groups, while phosphatases remove them. The combination of these modifications regulates protein activity, stability, and degradation. Moreover, most of these modifications are also regulated by components of the signaling pathways.

In eukaryotes, RNAs must be transported to the cytoplasm, where they carry out their function in protein synthesis. Regions of the RNA are exposed and recognized by adapter proteins such as Nab2, which interacts with Mex67 to form a complex recognized by the Mex67-Mtr2 receptor (TAP-p15 or NXF1-Nxt1 in higher eukaryotes) (Figure 9). Then, the RNA is guided toward the nuclear pore complex (NPC), where it is transported to the cytoplasm through the FG-nucleoporins lining the central channel. Similarly, protein transportation is regulated by the exposure of peptide signals present in the protein structure. For example, nuclear proteins have an NLS signal (Pro-Pro-Lys-Lys-Lys-Arg-Lys-Val) recognized by nuclear importins, which mediate transportation through the nuclear pore. In contrast, peroxisomal proteins have the Ser-Lys-Leu signal. Additionally, proteins synthesized on free ribosomes are transported to the mitochondria, nucleus, or peroxisome, or stay in the cytoplasm. This depends on the peptide signals in their structure, which are recognized by proteins named chaperones that belong to the transportation systems and ensure the correct location of the proteins. Ribosomes attached to the rough endoplasmic reticulum (RER) synthesize proteins which remain in the membrane or in the RER lumen. These proteins are then located in the plasma membrane, in lysosomes, or in transportation vesicles for storage or secretion.
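
Signal-based sorting can be illustrated with a toy classifier in Python. It is a sketch under explicit assumptions: the function name is hypothetical, only the two signals quoted in the text are used (written in one-letter code), the Ser-Lys-Leu signal is assumed to act at the C-terminus, and any protein without either signal is left in the cytoplasm.

```python
# Signals quoted in the text, in one-letter amino acid code.
NLS_MOTIF = "PPKKKRKV"       # Pro-Pro-Lys-Lys-Lys-Arg-Lys-Val
PEROXISOME_SIGNAL = "SKL"    # Ser-Lys-Leu (assumed here to be C-terminal)


def predict_destination(protein: str) -> str:
    """Return a destination based only on the two signals above (toy model)."""
    protein = protein.upper()
    if NLS_MOTIF in protein:
        return "nucleus"
    if protein.endswith(PEROXISOME_SIGNAL):
        return "peroxisome"
    return "cytoplasm"


print(predict_destination("MAPPKKKRKVEDG"))  # nucleus
print(predict_destination("MAGTLHSKL"))      # peroxisome
print(predict_destination("MAGTLH"))         # cytoplasm
```

Real sorting relies on receptors and chaperones that read these signals in their structural context, so exact string matching is only a stand-in for that recognition step.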

Post-translational modifications of autoantigens in autoimmunity.

Cell traffic of autoimmune mediators.

Some mRNAs and proteins persist in the cell for a long time and others are rapidly degraded. The mRNAs with a long poly A tail persist longer and, generally, are degraded by exoribonucleases. Studies in Xenopus oocytes have found that the length of the poly A tail depends on the balance between the cytoplasmic synthesis mediated by the GLD2 polymerase and the degradation carried out by the PARN ribonuclease. Proteins, in turn, can be degraded in the lysosome, by the proteasome after recognition of ubiquitination signals, or by autophagy, among other routes. Almost all of these routes require the recognition of degradation signals which act as markers on the proteins. For example, histone degradation is mainly regulated by the ubiquitination system through ubiquitin peptides that can be attached to their PEST domain for recognition and degradation by proteasomes. The proteasome is made up of a 19S regulatory subunit and a 20S subunit with chymotrypsin-like, trypsin-like, and caspase-like proteolytic activity. The 19S subunit also participates in transcription regulation by providing energy for the formation of the complex composed of the SAGA acetyltransferase and the transcription initiation factor. The energy is supplied by subunits containing six AAA-type ATPases (Rpt1-6) and two non-ATPases (Rpn1-2). The 19S subunit also interacts with the facilitates chromatin transcription (FACT) complex to reorganize histones during the elongation phase of transcription.

Epigenetic regulation of gene expression is mediated by mechanisms which modify access to the region containing the promoter or the start codon which is not related to cis signal sequences. This regulation is associated with DNA methylation, histone modification, chromatin remodeling, formation of double-strand structures between ncRNA complementary regions, and genome organization. The DNA is packed around nucleosomal units which are made up of octamers of histone proteins named H2A, H2B, H3, and H4. These are separated by an H1 molecule. For transcription, the DNA must be accessible, which means that the chromatin must be in a relaxed state. Modulation of the condensation state is done through protein complexes which modify the amino terminus end of the histones, thus making them release DNA fragments locally. Nucleosomes are not completely removed during transcription. As a result, the relaxation of the structure allows the transcriptional complex to circle the nucleosome as it reads the DNA template. Although epigenetic mechanisms will be discussed thoroughly in the chapter on epigenetics and autoimmune diseases, we will describe the main epigenetic mechanisms in the context of this section.

mRNA TNF-α regulation and stability.

DNA methylation occurs on CpG dinucleotides located on the promoters to prevent their recognition by transcriptional factors and inhibit the transcription process. Methylation of miRNA gene promoters has been frequently observed in tumor cells, e.g., the miR-124a promoter is hypermethylated in colon cancer cell lines but not in normal tissue.
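
Because methylation targets CpG dinucleotides, a first look at a promoter sequence often starts by counting them. The Python sketch below is illustrative only: the function name is hypothetical, and the observed/expected ratio it reports (CpG count × length / (C count × G count)) is a commonly used summary statistic that is not part of the original text.

```python
def cpg_stats(dna: str) -> dict:
    """Count CpG dinucleotides and report GC content and the observed/expected CpG ratio."""
    dna = dna.upper()
    length = len(dna)
    cpg = dna.count("CG")                    # CpG dinucleotides read 5' to 3'
    c, g = dna.count("C"), dna.count("G")
    gc_content = (c + g) / length if length else 0.0
    # Expected CpG count if C and G were placed independently: c * g / length.
    obs_exp = (cpg * length) / (c * g) if c and g else 0.0
    return {"cpg": cpg, "gc_content": gc_content, "obs_exp": obs_exp}


print(cpg_stats("CGCGATAT"))
```

The sketch counts potential methylation sites only; whether a given CpG is actually methylated must be measured experimentally.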

Chromatin remodeling and autoimmunity.

DNA methylation and Autoimmune Diseases.

Histones can be the object of covalent modifications such as methylation, acetylation, ubiquitination, phosphorylation, proline isomerization, ADP-ribosylation, and citrullination. They can also be degraded by proteolytic pathways. Altogether, these changes constitute the so-called “histone code,” which acts as a signal for recognition and interaction with regulatory proteins and epigenetic modifiers. These modifiers may have specific interaction domains; for example, a bromo-domain recognizes acetylated lysine. Moreover, acetylation of lysine residues in histones destabilizes internucleosomal interactions and the bond between histones and DNA. Note that these modifications are reversible. For example, lysine acetylation mediated by histone acetyltransferases (HATs) is associated with gene activation, and its action can be reversed by histone deacetylases (HDACs), which remove lysine acetyl groups. Histone hypoacetylation is associated with gene repression.

Chromatin remodeling consists of the modification of chromatin condensation states affecting DNA accessibility, especially in the promoter regions. This remodeling is the product of interaction between remodeling complexes such as SWI/ PAF and the combination of DNA methylation with histone methylation and acetylation. Furthermore, two chromatin regions can be distinguished: a more condensed heterochromatin that is transcriptionally inactive and a less condensed euchromatin with transcriptional activity. The HP1 protein has a chromo-domain which binds to H3K9me. Both are heterochromatin markers in most eukaryotes and act as a dynamic platform to recruit factors in processes such as cell-type switching, sister chromatid cohesion, RNAi, and transcriptional gene silencing.

Homothallic switching deficient/sucrose nonfermenting (SWI/SNF) complex is an ATP-dependent complex recruited by activator molecules on gene promoters and it also binds to TFIIS. It acts mainly in the elongation phase of transcription. In addition, the ATPase chromo-ATPase/helicase-DNA binding domain (CHD1) remodels nucleosomes during elongation and termination, interacts with the Set-2, SWI/SNF, Paf, DSIF, and FACT complexes that follow it, and has a chromo-domain which binds to methylated histones. Another ATP-dependent chromatin remodeling complex is the ISWI protein family (imitation switch). However, their components can vary as well as their actions during transcription. The Isw1a complex composed of Isw1p and Ioc3p (Isw one complex) negatively affects nucleosome positioning during transcription initiation. The Isw1b complex, in turn, which consists of Isw1p, Ioc2p, and Ioc4p, regulates and links the transcription elongation and termination phases with mRNA maturity. Isw1p preferentially recognizes di and trimethylated H3-K4 which contribute to its recruitment on RNAP II.

A general remodeling complex, FACT, is a heterodimer of hSpt16 and structure-specific recognition protein-1 (SSRP1, an HMG-box protein). Several of its subunits have been identified as components of DNA polymerase alpha, which interacts with DSIF, Spt6, PAF, Chd1, and histones. Recruitment of FACT to promoters is achieved through the intervention of heat shock chaperone proteins of the Hsp family, and FACT destabilizes the nucleosome through the selective removal of an H2A/H2B dimer. This allows RNAP II to pass around the nucleosome and transcribe the DNA wrapped around it; this activity is mediated by direct interactions between FACT and the H2A/H2B dimer of the nucleosome. Moreover, Spt6 interacts with RNAP II, DSIF, TFIIS, PAF, FACT, and histone H3 to promote reassembly of the nucleosome while transcription takes place.

Genomic architecture is not collinear; instead, it is crosslinked and modular. Moreover, several sequences are multifunctional because they are used for multiple transcripts and also as regulatory regions. Inside the nuclei, the genome is spatially organized within chromosomes, and there are genes occupying preferential positions relative to each other and to various nuclear and nucleolar landmarks. There are intra- and inter-chromosomal communications between different genomic regions which appear to play important roles in genome expression, function, and regulation. Some evidence of gene expression regulation mediated by genome organization is described below.

Transcriptional factories: these are regions of the genome with high transcription rates and high concentration of transcription factors and RNAP subunits. These subunits have a tissue-specific tridimensional self-organization for regulating gene expression. Actin can collaborate in the organization of factories, and it is implicated in several nuclear processes such as chromatin remodeling, transcription, and mRNA export. It is also required for the action of all three RNA polymerases during transcription in vivo and in vitro. Transcription factories may be attached to a type of nuclear actin scaffold. In genes transcribed by RNAP II, regulating regions such as LCR and enhancers, which are located close together in transcriptional factory regions, can act on genes far from their chromosomal location.

Position in the genome: the position of a gene in the chromosome and nucleus is also important for determining gene expression. For example, genes located in the centromeric region or close to it are inactive, and telomeric genes may be lost due to telomere shortening. Centromeres are specialized regions at the center of chromosomes that are composed of heterochromatin and play a critical role in the accurate segregation of duplicated chromosomes during cell division. In contrast, telomeres are regions located at the termini of linear chromosomes and consist of repeated DNA sequences (TTAGGG) and different telomere binding proteins. Telomere shortening refers to the incomplete replication of chromosomal ends during each round of DNA replication (S phase), which results in the loss of a small fraction of telomeric DNA. This may be used as a marker to determine the biological age of cells, tissues, organs, and probably humans. Moreover, telomerase enzymes synthesize telomere DNA and preserve telomere length and function, but they are absent in most cells. Telomerase can be found in cell types such as stem cells, embryonic cells, germline cells, lymphocytes, and tumor cells. On average, telomeres are longer in naïve than in memory T cells, and are longer in CD28+ than in more differentiated CD28− CD8+ T cells. In contrast, the leukocyte and PBMC telomeres are short. Telomere shortening has also been described in association with a number of immune-related diseases and as a risk factor for some types of tumors. It has been reported that patients with rheumatoid arthritis and diabetes mellitus (type 1 and type 2) display significantly shortened telomeres in leukocytes or PBMC compared to age-matched healthy controls. This topic will be addressed in depth in the chapter on immunosenescence.

Multifractal analysis of the genome: The human genome is one of the most complex molecular structures because it is a mosaic of coding and non-coding sequences. It has highly regionalized structures which form a complex pattern for gene structure and expression regulation. Fractal methodologies, which are derived from an approach focused on the degree of fragmentation (or irregularity) existing in nature, were used to analyze the fractal characteristics of the human genome. The fractal nature of the genome was studied using several sets. Each set produces a fractal dimension, and together these form a continuous spectrum of exponents called a singularity spectrum. The degree of multifractality (MD) obtained from this continuous spectrum allows the genetic information content to be measured. Multifractal analysis of chromosomal fragments, chromosomes, and chromosomal regions reveals the presence of repetition units such as the Alu sequences, which confer fractal characteristics on the genome. On this basis, chromosomes can be classified as having low (4, X, 13, 5, 18, 3, 6, Y, 8, 2, 11), medium (21, 14, 9, 10, 7, 12, 1, 20, 15), or high (16, 22, 17) fractality. Regions with low fractality are related to low genome stability while those with high fractality are associated with high genetic stability, e.g., chromosome 19 presents the highest fractality (Figure 11).

Long-range interaction in MHC Class II Complex.

RNA systems: RNA mapping techniques show that, in the nucleus, there are more non-polyadenylated and non-coding snRNAs than mRNAs. These snRNAs were designated ‘transcripts of unknown function’ (TUFs) by the ENCODE consortium. Several transcriptional units overlap and are designated sense and antisense; thus, many mRNAs have antisense transcripts. With the discovery of several non-coding RNA (ncRNA) species, RNA-based systems for the regulation of gene expression began to be elucidated. The control provided by non-coding RNA regulates the stability of its target molecules, modifies their structure, contributes to transcription initiation and translation in a temporally, spatially, and tissue-specific manner, modulates chromatin organization, regulates alternative splicing, controls the subcellular localization of proteins, and regulates heat shock sensing. Furthermore, functional ncRNAs vary significantly in size, from ~22 bp for miRNAs to ~18 kb for XIST (X-inactive-specific transcript) and ~108 kb for AIR (antisense IGF2R RNA). Given their size variation and the diversity of events in which they participate, ncRNAs could exceed the protein regulation system in quantity. The best-known RNA-dependent regulation systems are those mediated by snRNAs: non-coding RNAs (ncRNAs), small nucleolar RNAs (snoRNAs), and microRNAs (miRNAs).

miRNAs are a family of small ribo-oligonucleotides which constitute the main group of post-transcriptional gene regulators, but they do not translate into proteins. miRNAs regulate cell proliferation, apoptosis, and differentiation and intervene in cell processes such as signal transduction, metabolism, and mRNA useful life. Moreover, genes encoding for miRNA have been located in cancerous cells in regions with fragilities or with a high rate of chromosomal aberrations and often cause a loss of function. A cluster of miRNA genes was described on human chromosome 19 (C19MC) among repeated Alu sequences. Some were located in intronic regions and others, in exonic regions. Thirty percent of the genes are believed to be regulated by miRNAs. In the immune system, several processes regulated by miRNAs have been described: ablation of the miRNA machinery as well as the loss of function of some molecules contribute to the formation of certain components and are associated with the development of autoimmunity and cancer. For example, miR-223 controls generation and activation of granulocytes in the inflammatory response. Its loss in mice results in increased granulocyte progenitor cells, hyper mature granulocytes that are hypersensitive to activation stimuli, and increased antifungal activity. In psoriasis, a chronic inflammatory cutaneous disease, over-expression of miRNA-203 and a reduction of its target gene – the suppressor of cytokine signaling 3 (SOCS-3) – was found in psoriatic plaques. SOCS-3 participates in the inflammatory response and in some functions of keratinocytes. A good example of RNA system regulation of gene expression is the tumor-suppressor gene called phosphatase and tensin homolog (PTEN). PTEN is a negative regulator of the PI3K-Akt pathway and is epigenetically silenced in several cancers. Also, PTEN is an indispensable regulator of B cell homeostasis. PTEN expression is post-transcriptionally regulated by the action of a PTEN pseudogene (PTENpg1). 
This is a long, noncoding RNA (lncRNA) that sequesters numerous PTEN-targeting miRNAs by acting as a miRNA sponge (Figure 12).

Cells are in constant communication with the extracellular medium. Cells are able to detect and interpret extracellular signals that regulate the expression of target genes through mechanisms which vary from posttranslational modification of proteins leading to their activation, inhibition, or degradation to the activation or repression of genetic programs. There is permanent cell signaling among nearby and distant cells belonging to the same individual in whom signals coming from different sources can induce cell responses. These responses include activation or inactivation of genetic programs which lead to proliferation, differentiation, cell death, senescence, or accomplishment of cell functions within a tissue.

The signals can have different molecular structures or sizes, especially those that are hydrophilic or hydrophobic in nature, and have specific mechanisms to transmit information within the cell. Likewise, signals can also be physical, magnetic, electric, etc. Hydrophobic signal molecules such as vitamin D, thyroid and steroid hormones, retinoids, and eicosanoids can cross the plasma membrane and bind to receptor molecules at the cytoplasm or inside the nucleus.

These signal molecules are associated with long-lasting responses due to their high blood permanence. Receptors for these compounds have interaction domains with the ligand, the DNA, and transcription activation or repression processes, thus conferring the characteristics of transcription factors on these receptors.

Hydrophilic signal molecules such as neurotransmitters, protein hormones, glycoprotein hormones, and local chemical mediators require the presence of a receptor molecule in the plasma membrane. These signal molecules mediate short-duration responses and have low blood permanence. Their ligand-receptor complex may or may not be endocytosed, the ligand may be destroyed, and the receptor can be recycled or destroyed. Hydrophilic molecules act on G protein-coupled receptors, ion channels, or tyrosine kinase receptors. When the ligand binds, the receptor can undergo modifications in its structure which lead to cascading molecular changes until a cell response is generated. In the case of receptors coupled to heterotrimeric G proteins (α, β, and γ subunits), ligand binding induces the dissociation of the G protein (the β and γ subunits separate from the α subunit) and the mobilization of its subunits within the membrane until they interact with effector proteins. These proteins in turn activate molecules downstream, and the effect falls on second messengers such as cAMP, cGMP, Ca2+, and DAG. These, in turn, modulate the state of activation of target molecules such as PKA, PKG, calmodulin, PKC, and other serine/threonine kinases. These molecules also act on different targets including transcription factors, cytoskeleton structures, and mitochondrial proteins. Expression of genetic programs may be modulated in the cascade, thus leading to proliferation, differentiation, cell death, senescence, or functional activity in a tissue. On the same pathway, cell characteristics such as adhesion, migration, and survival can also be modulated. Receptors associated with ion channels, in turn, modulate the transportation of substances between the extracellular medium and the interior of the cell by opening or closing the channels, thus altering the membrane’s permeability and excitability. 
Finally, receptors with autocatalytic activity located in the intracytoplasmic region have a tyrosine kinase domain which is activated when the ligand binds to the receptor, which generates a structural change making the receptor accessible to adapter proteins. Afterwards, proteins are activated in cascade and act upon different target molecules. The MAPK pathway is the classical example for these types of signal transduction pathways.
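
The pairing of second messengers with their targets described for G protein-coupled receptors can be summarized as a simple lookup table. The following is only an illustrative sketch: the names `SECOND_MESSENGER_TARGETS` and `targets_for` are hypothetical, and the table lists just the four pairings named in the text, not a complete model of any pathway.

```python
# Illustrative lookup table (hypothetical names): each second messenger
# named in the text is paired with the target it is said to modulate.
SECOND_MESSENGER_TARGETS = {
    "cAMP": "PKA",
    "cGMP": "PKG",
    "Ca2+": "calmodulin",
    "DAG": "PKC",
}


def targets_for(messengers):
    """Return the targets for the given second messengers, in order.

    Messengers that are not in the table are skipped, because this
    sketch only covers the pairings mentioned in the text.
    """
    return [
        SECOND_MESSENGER_TARGETS[m]
        for m in messengers
        if m in SECOND_MESSENGER_TARGETS
    ]


if __name__ == "__main__":
    # A stimulus that raises both cAMP and DAG would, in this simplified
    # view, modulate PKA and PKC.
    print(targets_for(["cAMP", "DAG"]))
```

A flat table is enough here because the text describes a one-to-one pairing. Representing the crosstalk between pathways would need a graph structure instead.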

Recently, many new members of signaling pathways have been identified, along with molecules that mediate interactions between pathways. Together they reveal a communication network that is difficult to interpret but that provides clues to the complex regulation of gene expression.
