?url_ver=Z39.88-2004&rft_id=1817051037&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&rft.title=IMPLEMENTASI+ALGORITME+SMOTE+DAN+KLASIFIKASI+RANDOM+FOREST+PADA+IMBALANCED+DATA+METILASI+SEQUENCE+PROTEIN+LISIN+&rft.creator=Annisa+Nurwalikadani%2C+1817051037&rft.subject=004+Pemrosesan+data+dan+ilmu+komputer&rft.subject=005+Pemrograman+komputer%2C+program+dan+data&rft.description=Salah+satu+masalah+yang+sering+ditemukan+pada+saat+pengolahan+data+adalah+ketidakseimbangan+jumlah+sampel+dari+masing-masing+kelas+dalam+data+atau+yang+biasa+disebut+dengan+imbalanced+data.+Dalam+beberapa+tahun+terakhir%2C+banyak+penelitian+mengenai+ketidakseimbangan+data+di+bidang+bioinformatika%2C+khususnya+Post+translational+Modification+(PTM).+Studi+kasus+dalam+penelitian+ini+adalah+masalah+ketidakseimbangan+data+metilasi+sequence+protein+lisin.+Masalah+ketidakseimbangan+data+tentunya+akan+mempengaruhi+hasil+klasifikasi.+Oleh+karena+itu+diperlukan+suatu+metode+untuk+mengatasi+masalah+tersebut%2C+salah+satunya+adalah+algoritme+SMOTE+yang+akan+digunakan+dalam+penelitian+ini.+%0D%0ATujuan+dari+penelitian+ini+adalah+untuk+menganalisis+kinerja+algoritma+SMOTE+dan+klasifikasi+random+forest+dalam+menangani+masalah+ketidakseimbangan+data.+Dataset+metilasi+urutan+protein+lisin+diperoleh+dari+http%3A%2F%2Fwww.uniprot.org%2F+dengan+mencari+kata+kunci+%22metilasi%22%2C+memiliki+1000+data+positif+dan+172+data+negatif+serta+memiliki+panjang+sequence+15+asam+amino.+Pemisahan+dataset+menjadi+2%2C+yaitu+80%25+data+latih+20%25+data+uji+dan+90%25+data+latih+10%25+data+uji.+Fitur+ekstraksi+yang+digunakan+adalah+AA+Index%2C+PseACC%2C+Hydrophobicity%2C+dan+CTD.+Kemudian%2C+data+hasil+ekstraksi+diolah+menggunakan+klasifikasi+random+forest+dengan+parameter+ntree+500%2C+800%2C+dan+1000+serta+mtry+7%2C+9%2C+dan+14.+Hasil+tertinggi+diperoleh+pada+pemisahan+dataset+80%25+data+latih+20%25+data+uji+dengan+mtry+14+ntree+500%2C+menghasilkan+akurasi+95%2C65%25%2C+sensitivitas+96%2C2%25%2C+spesifisitas+95%25%2C+dan+MCC+91%2C
25%25.%0D%0A%0D%0AA+common+problem+in+data+processing+is+an+imbalance+in+the+number+of+samples+per+class%2C+known+as+imbalanced+data.+In+recent+years%2C+many+studies+have+addressed+imbalanced+data+in+bioinformatics%2C+especially+for+post-translational+modification+(PTM).+The+case+study+in+this+research+is+imbalanced+protein+lysine+methylation+sequence+data.+Class+imbalance+affects+classification+results%2C+so+a+method+is+needed+to+address+it%3B+this+research+uses+the+SMOTE+algorithm.+The+purpose+of+this+research+is+to+analyze+the+performance+of+the+SMOTE+algorithm+and+random+forest+classification+in+handling+imbalanced+data.+The+protein+lysine+methylation+sequence+dataset+was+obtained+from+http%3A%2F%2Fwww.uniprot.org%2F+by+searching+for+the+keyword+%22methylation%22%3B+it+contains+1000+positive+samples+and+172+negative+samples%2C+each+with+a+sequence+length+of+15+amino+acids.+The+dataset+was+split+in+two+ways%3A+80%25+training+data+and+20%25+test+data%2C+and+90%25+training+data+and+10%25+test+data.+The+extracted+features+are+AA+Index%2C+PseACC%2C+Hydrophobicity%2C+and+CTD.+The+extracted+data+were+then+classified+using+random+forest+with+ntree+values+of+500%2C+800%2C+and+1000+and+mtry+values+of+7%2C+9%2C+and+14.+The+best+results+were+obtained+with+the+80%25+training+and+20%25+test+split+using+mtry+14+and+ntree+500%2C+giving+95.65%25+accuracy%2C+96.2%25+sensitivity%2C+95%25+specificity%2C+and+91.25%25+MCC.%0D%0A%0D%0A&rft.publisher=Matematika+dan+Ilmu+Pengetahuan+Alam&rft.date=2022-11-24&rft.type=Skripsi&rft.type=NonPeerReviewed&rft.format=text&rft.identifier=http%3A%2F%2Fdigilib.unila.ac.id%2F67956%2F1%2FABSTRAK%2520%2528ABSTRACT%2529.pdf&rft.format=text&rft.identifier=http%3A%2F%2Fdigilib.unila.ac.id%2F67956%2F2%2FSKRIPSI%2520FULL.pdf&rft.format=text&rft.identifier=
http%3A%2F%2Fdigilib.unila.ac.id%2F67956%2F3%2FSKRIPSI%2520FULL%2520TANPA%2520PEMBAHASAN.pdf&rft.identifier=++Annisa+Nurwalikadani%2C+1817051037++(2022)+IMPLEMENTASI+ALGORITME+SMOTE+DAN+KLASIFIKASI+RANDOM+FOREST+PADA+IMBALANCED+DATA+METILASI+SEQUENCE+PROTEIN+LISIN.++Matematika+dan+Ilmu+Pengetahuan+Alam%2C+Universitas+Lampung.+++++&rft.relation=http%3A%2F%2Fdigilib.unila.ac.id%2F67956%2F