AS410
US
Johns Hopkins University
A known protein with a sequence similar to our novel protein is the DNA damage recognition and repair factor (xpa) from Takifugu rubripes, whose mRNA showed 91.21% identity and an E-value of 0 in a BLASTN search. In SWISS-MODEL, the predicted structures of the novel protein and the known protein look similar, and both are monomers. Both proteins were modelled against the same SWISS-MODEL template, “DNA repair protein complementing XP-A cells”; the novel protein has 72.05% sequence identity to the template, whereas the known protein has 64.8%. Both Ramachandran plots indicate that the known and novel proteins contain α-helices, and the protein models on the left of both images show these helical structures. The novel and known proteins share more similarities than differences, which suggests that this may be a conserved protein present in a species where it has not yet been studied. The only difference observed is in their sequence identity to the SWISS-MODEL template 6ro4.1.l.
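As a rough illustration of what a percent identity figure measures, the sketch below counts matching columns in two sequences that have already been aligned. The function name and the toy sequences are hypothetical, and this is not how BLASTN or SWISS-MODEL compute their values internally: both tools build their own alignments first, so their reported identities depend on that alignment.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same residue.

    Both inputs must already be aligned (same length, '-' for gaps).
    Columns where only one sequence has a gap count as mismatches;
    columns where both have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy example (made-up sequences, not the pufferfish data)
identity = percent_identity("ATGC-A", "ATGCTA")
print(f"{identity:.2f}% identity")
```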
SWISS-MODEL is designed to work with minimal user input, such as the amino acid sequence alone. Comparative modelling varies in complexity, however, and some modelling projects may require further user input, such as selecting among templates or adjusting the target-template alignment. The SWISS-MODEL server offers three interaction modes: first approach mode, alignment mode, and project mode. In first approach mode, only an amino acid sequence is required as input. In alignment mode, the user supplies the alignment between the target and a template protein from the ExPDB template library. Project mode gives the user control over parameters such as gap placement and template selection (Schwede, Kopp, Guex, & Peitsch, 2003).
Torafugu Protein
To determine whether the gene is under positive or negative evolutionary selection, Tajima’s D test was used. This test compares the number of segregating sites with the average number of pairwise differences among the sampled sequences. According to Tajima, who developed the test, a negative D value indicates a positive selective sweep. Such a sweep indicates that no major changes have taken place and that the gene will proceed with little to no mutation (Tajima, 1989). As a group, we ran this test on 15 different samples in two formats: a protein alignment and an mRNA alignment. The results of those tests are shown below.
To obtain these results, MEGA X software was used to prepare the data sets and run the test. The method began by obtaining the 15 sequences in both protein and mRNA formats. Those sequences were then uploaded to MEGA X as new FASTA projects and aligned using the MUSCLE alignment tool. This alignment step is important for tidying the data and ensuring accurate results. After the alignment was complete, the results were exported as MEGA format files and saved for later use. The same method was followed separately for the mRNA and the protein sequences. Once the data were ready, each file was opened in MEGA X and a selection test was run on it. In this case, the test used was Tajima’s D test.
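The quantities that the selection test summarizes from an aligned file can be sketched in a few lines of Python. This is only an illustration of the idea, not MEGA X's implementation: the function names and the toy alignment are hypothetical, and dropping every column that contains a gap is an assumption made here for simplicity.

```python
from itertools import combinations


def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def alignment_summary(seqs):
    """Return (usable_sites, S, k) for a list of aligned sequences.

    usable_sites: columns with no gap in any sequence (assumption: drop gapped columns)
    S: number of segregating sites (usable columns with more than one residue)
    k: average number of differences over all pairs of sequences
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    columns = [col for col in zip(*seqs) if "-" not in col]
    seg_sites = sum(1 for col in columns if len(set(col)) > 1)

    # Rebuild gap-free sequences, then count differences for every pair
    cleaned = ["".join(chars) for chars in zip(*columns)] if columns else []
    pairs = list(combinations(cleaned, 2))
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    k = total_diffs / len(pairs) if pairs else 0.0
    return len(columns), seg_sites, k


# Toy alignment (made-up data, four sequences of six columns)
toy = """>s1
ATGCAT
>s2
ATGTAT
>s3
A-GCAT
>s4
TTGCAT
"""
sequences = list(parse_fasta(toy).values())
sites, S, k = alignment_summary(sequences)
print(f"m={len(sequences)} usable sites={sites} S={S} k={k}")
```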
Table 1: Protein sequence analysis using Tajima’s test to determine the direction of the evolutionary selective sweep. Here m is the sample size, S is the number of segregating sites, ps is the proportion of sites that are segregating, Θ is the expected value of pairwise differences, π is the average number of pairwise differences, and D is the neutrality value, or Tajima’s D value.
Table 2: mRNA sequence analysis using Tajima’s test to determine the direction of the evolutionary selective sweep. Here m is the sample size, S is the number of segregating sites, ps is the proportion of sites that are segregating, Θ is the expected value of pairwise differences, π is the average number of pairwise differences, and D is the neutrality value, or Tajima’s D value.
The first table shows the results of Tajima’s neutrality test for the protein sequences, where the test statistic, D, is negative. A negative D value indicates that rare mutations are present in the sequences, which points to positive selection. The variable π represents the average number of pairwise differences and Θ is the expected value of pairwise differences. In both tables, π and Θ are not close in value, and Θ is the larger value. The expected number of pairwise differences being larger than the observed average suggests that selection is acting on the gene. A negative D value is therefore consistent with positive selection.
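To make the link between these quantities and the sign of D concrete, here is a minimal Python sketch of the statistic as defined by Tajima (1989). The function name is hypothetical, the inputs are whole-alignment counts rather than the per-site values MEGA X reports (the resulting D is the same as long as both terms use the same scale), and the numbers in the usage lines are illustrative, not the values from our tables.

```python
import math


def tajimas_d(m, S, k):
    """Tajima's D from sample size m, segregating sites S, and
    k, the average number of pairwise differences (whole-alignment count)."""
    if m < 4:
        raise ValueError("need at least 4 sequences for a non-zero variance")
    if S < 1:
        raise ValueError("D is undefined when there are no segregating sites")

    a1 = sum(1.0 / i for i in range(1, m))
    a2 = sum(1.0 / i**2 for i in range(1, m))

    b1 = (m + 1) / (3.0 * (m - 1))
    b2 = 2.0 * (m**2 + m + 3) / (9.0 * m * (m - 1))

    c1 = b1 - 1.0 / a1
    c2 = b2 - (m + 2) / (a1 * m) + a2 / a1**2

    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    expected = S / a1  # expected pairwise differences under neutrality
    variance = e1 * S + e2 * S * (S - 1)
    return (k - expected) / math.sqrt(variance)


# Illustrative values only: when the observed average k falls below
# the expected value S / a1, D comes out negative.
d = tajimas_d(m=10, S=16, k=3.888889)
print(f"D = {d:.3f}")
```

When the observed average is instead larger than the expected value, the same function returns a positive D.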
When a genome-wide prior is applied during the analysis of the results, deviations from the expected Tajima’s D values are likely to be observed even when low-depth data are used. With the EB approach, by contrast, there is an increased chance of observing bias in the regions under selection. However, this bias is small compared with that which would be seen with the GC approach. A main point about Tajima’s D in this setting is that, contrary to expectation, it has only a very weak dependence on sequencing depth. As such, it would allow genome-wide scans on genomes with varying sequencing depth.
The novel gene comes from the spotted green pufferfish and is a cDNA sequence. A BLASTN search suggests that this gene is closely related to a family of genes involved in DNA damage recognition and repair. The closest sequence to our novel gene is ACC XM_003972574.3, the DNA damage recognition and repair factor (xpa) mRNA from Takifugu rubripes. Using the “Identify Conserved Domains” option on the NCBI Nucleotide Database, the conserved domain of the protein ACC XP_003972623 falls in the superfamily cl40443 and is called DNA repair protein. The DNA repair protein family recognizes DNA damage as part of nucleotide excision repair. The full set of conserved domains for this protein includes the DBD_XPA-like superfamily and other DNA repair protein domains. The DBD_XPA-like superfamily includes the DNA repair protein complementing XP-A cells (XPA), yeast DNA repair protein RAD14, zinc transporter 9 (ZNT9), and similar proteins. There is a conserved N-terminal zinc-binding subdomain and a C-terminal alpha/beta fold subdomain. This protein family has several conserved domains and appears to be under positive selection.
Some of the first data pointing towards the duplication of genes in these fish came from Hox genes and their clusters. Hox genes play the critical role of encoding DNA-binding proteins that specify the fate of the cell along the anterior-posterior axis. In these fish the genes are organized into about seven to eight clusters, which suggests that a genome duplication occurred before their final divergence. The most recent genomic studies reveal two copies of the gene clusters in fish but a single copy in other vertebrates.
Schwede, T., Kopp, J., Guex, N., & Peitsch, M. C. (2003). SWISS-MODEL: An automated protein homology-modeling server. Nucleic acids research, 31(13), 3381–3385. https://doi.org/10.1093/nar/gkg520