Predicting perceived ethnicity with data on personal names in Russia
dc.contributor.author | Bessudnov, A | |
dc.contributor.author | Tarasov, D | |
dc.contributor.author | Panasovets, V | |
dc.contributor.author | Kostenko, V | |
dc.contributor.author | Smirnov, I | |
dc.contributor.author | Uspenskiy, V | |
dc.date.accessioned | 2023-04-04T13:26:51Z | |
dc.date.issued | 2023-04-04 | |
dc.date.updated | 2023-04-04T12:28:20Z | |
dc.description.abstract | In this paper, we develop a machine learning classifier that predicts perceived ethnicity from data on personal names for major ethnic groups populating Russia. We collect data from VK, the largest Russian social media website. Ethnicity was coded from languages spoken by users and their geographical location, with the data manually cleaned by crowd workers. The classifier shows the accuracy of 0.82 for a scheme with 24 ethnic groups and 0.92 for 15 aggregated ethnic groups. It can be used for research on ethnicity and ethnic relations in Russia, with the data sets that have personal names but not ethnicity. | en_GB |
dc.identifier.citation | Published online 4 April 2023 | en_GB |
dc.identifier.doi | https://doi.org/10.1007/s42001-023-00205-y | |
dc.identifier.uri | http://hdl.handle.net/10871/132840 | |
dc.identifier | ORCID: 0000-0002-2541-9794 (Bessudnov, Aleksei) | |
dc.language.iso | en | en_GB |
dc.publisher | Springer | en_GB |
dc.relation.url | https://github.com/abessudnov/ruEthnicNamesPublic | en_GB |
dc.rights | © The Author(s) 2023. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ | en_GB |
dc.subject | ethnicity | en_GB |
dc.subject | Russia | en_GB |
dc.subject | machine learning | en_GB |
dc.subject | prediction | en_GB |
dc.subject | personal names | en_GB |
dc.title | Predicting perceived ethnicity with data on personal names in Russia | en_GB |
dc.type | Article | en_GB |
dc.date.available | 2023-04-04T13:26:51Z | |
dc.description | This is the final version. Available on open access from Springer via the DOI in this record | en_GB |
dc.description | Data availability statement: The research data supporting this publication and the Python code are openly available from Github at: https://github.com/abessudnov/ruEthnicNamesPublic | en_GB |
dc.identifier.eissn | 2432-2725 | |
dc.identifier.journal | Journal of Computational Social Science | en_GB |
dc.relation.ispartof | Journal of Computational Social Science | |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | en_GB |
dcterms.dateAccepted | 2023-03-13 | |
dcterms.dateSubmitted | 2022-12-20 | |
rioxxterms.version | VoR | en_GB |
rioxxterms.licenseref.startdate | 2023-04-04 | |
rioxxterms.type | Journal Article/Review | en_GB |
refterms.dateFCD | 2023-04-04T12:28:22Z | |
refterms.versionFCD | AM | |
refterms.dateFOA | 2023-04-04T13:26:52Z | |
refterms.panel | C | en_GB |