SESL Research Assistants Take Part in the First St Petersburg R User Group
The first meeting of the St Petersburg R User Group took place on March 28th. The group brings together specialists who use the R programming language for statistical modelling in their research. Under the group's format, community members propose topics they consider potentially interesting for their own presentations, and the meetings serve mainly for sharing findings and networking. Although data science is mostly the domain of mathematicians, engineers and analysts, several HSE students and SESL research assistants took part in the meeting.
Nastya Nesterenko, 2nd year, Department of Sociology:
"I liked the fact that the presentations were mostly clear and intelligible. The presenters were talking about things that we sociologists do as well. They also discussed some basic topics, and their talks were not tedious. I also liked that I could take something new from every presentation, and at times it was simply good to brush up on my knowledge of the topic."
There were four presentations at the meeting. Alexey Shlemov spoke about code optimization in R and about additional packages for resource-intensive tasks. Filipp Upravitelev's presentation covered caret, a package for applied predictive modelling: its features, the types of models it supports, data pre-processing and analysis of results. Anton Antonov discussed the tidyr, dplyr and magrittr family of packages for data processing. "With magrittr you can make the code more readable," he said, "but it's better not to overuse it." He also showed how using tidyr and dplyr can make data easier to analyze.
Finally, Alexey Natekin talked about organizing large projects in R. In his opinion, it is essential that code be clear not only to its author but also to other programmers, and for this it needs an easily understandable structure. That, in turn, helps a team work together smoothly. The presentation sparked heated discussion, however; not everyone in the audience agreed with the speaker.
Ilya Musabirov, Lecturer, Programming and Data Analysis courses:
"It is important that our students can already participate in the work of the R User Group. Its growing scope and interdisciplinary character make it a good meeting ground, where mathematicians, programmers, sociologists, psychologists, and linguists can find a common language. For the last three groups, it’s really important that they can participate fully in this dialogue and make a meaningful contribution. Not everything in the presentations and discussions was clear to the students (the problem of code optimization, for example), but they have only just started learning. Some things are familiar but are being presented in a new way. And some things they know as well as the other participants, for example the data processing infrastructure based on dplyr and magrittr, because we teach that to sociologists from the first year. We hope that next year, when the minor course in Data Science launches at St Petersburg HSE, we shall be able to participate more actively in the St Petersburg R User Group, as well as in the R community as a whole."
By Denis Bulygin