Data analysis in literary studies tends to involve the analysis of text documents (see chapters chp-vector-space-model and chp-getting-data). Data-intensive research in the humanities and allied social sciences in general is far more likely to feature the analysis of tabular data than text documents. Tabular datasets organize machine-readable data (numbers and strings) into a sequence of records. Each record is associated with a fixed number of fields. Tabular datasets are often viewed in a spreadsheet program such as LibreOffice Calc or Microsoft Excel. This chapter demonstrates the standard methods for analyzing tabular data in Python in the context of a case study in onomastics, a field devoted to the study of naming practices.

In this chapter, we review an external library, "Pandas", which was briefly touched upon in chapter Introduction. This chapter provides a detailed account of how scholars can use the library to load, manipulate, and analyze tabular data. Tabular data often complement text datasets like those analyzed in previous chapters. (Metadata accompanying text documents is often stored in a tabular format.) As example data here, we focus on a historical dataset consisting of records in the naming of children from the United States. These data are chosen for their connection to existing historical and sociological research on naming practices (Fischer, Sue and Telles, and Lieberson are three among countless examples) and because they create a useful context for practicing routines such as column selection or drawing time series. The material presented here should be especially useful for scholars coming from a different scripting language background (e.g., R or Matlab), where similar manipulation routines exist. We will cover in detail a number of high-level functions from the Pandas package, such as the convenient groupby() method and methods for splitting and merging datasets. The essentials of data visualization, on which subsequent chapters in the book will draw, are also introduced. The reader will, for instance, learn to make histograms and line plots using the Pandas DataFrame methods plot() or hist().

As this chapter's case study, we will examine diachronic developments in child naming practices. In his book A Matter of Taste, Lieberson examines long-term shifts in child naming practices from a cross-cultural and socio-historical perspective. One of the key observations he makes concerns an accelerating rate of change in the names given to children over the past two centuries. Whereas older data suggest rather conservative naming practices, with barely any changes in the most popular names over long periods of time, more recent data exhibit similarities with fashion trends (see, e.g., Acerbi et al.), in which the rate of change in leading names has significantly increased. As Lieberson and Lieberson and Lynn explain, it is extremely difficult to pinpoint the factors underlying this shift in naming practices, because of its co-occurrence with numerous sociological, demographic, and cultural changes (e.g., industrialization, spread of literacy, and decline of the nuclear family). We do not seek to find a conclusive explanation for the observed shifts in the current chapter. The aim of the chapter is, on the one hand, to show how to map out such examples of cultural change using Python and Pandas, and, on the other hand, to replicate some of the analyses in the literature that aim to provide explanations for the shift in naming practices.

The structure of the chapter is as follows. In section Loading, Inspecting, and Summarizing Tabular Data we introduce the most important third-party library for manipulating and analyzing tabular data: Pandas. Subsequently, in section Mapping Cultural Change we show how this library can be employed to map out the long-term shift in naming practices as addressed by Lieberson. After that, we work on a few small case studies in section Changing Naming Practices. These case studies serve, on the one hand, to further investigate some of the changes of naming practice in the United States, and, on the other hand, to demonstrate advanced data manipulation techniques. We conclude in section Conclusions and Further Reading with an overview of some resources for further reading on the subject of analyzing tabular data with Python.

Loading, Inspecting, and Summarizing Tabular Data

In this section, we will demonstrate how to load, clean and inspect tabular data with Python. As an example dataset, we will work with baby name data, which can be found in the file data/names.csv. This file contains records in the naming of children in the United States from the nineteenth century until modern times. It provides all names for girls and boys that occur at least five times. Before we can analyze the dataset, the first task is to load it into Python. To appreciate the loading functionalities as implemented by the library Pandas, we will first write our own data-loading routines.
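As a rough illustration of what loading tabular data by hand can look like, the following sketch reads comma-separated records with Python's standard csv module and sums a count per year. It is not the chapter's own routine. The column names (year, name, sex, frequency) are an assumption about how a file such as data/names.csv is laid out, the helper names load_records and total_per_year are hypothetical, and the four rows are toy values invented purely for illustration.

```python
import csv
import io
from collections import defaultdict

# Toy rows invented for illustration; the column names (year, name, sex,
# frequency) are an assumption about the layout of data/names.csv.
SAMPLE = """year,name,sex,frequency
1880,Mary,F,10
1880,John,M,12
1881,Mary,F,8
1881,Anna,F,5
"""


def load_records(stream):
    """Read CSV rows into a list of dicts, converting numeric fields to int."""
    records = []
    for row in csv.DictReader(stream):
        row["year"] = int(row["year"])
        row["frequency"] = int(row["frequency"])
        records.append(row)
    return records


def total_per_year(records):
    """Sum the frequency field per year (a hand-rolled group-by)."""
    totals = defaultdict(int)
    for record in records:
        totals[record["year"]] += record["frequency"]
    return dict(totals)


records = load_records(io.StringIO(SAMPLE))
totals = total_per_year(records)
print(len(records), totals)
```

To read an actual file instead of the toy sample, one could pass open("data/names.csv", newline="") to load_records. With Pandas, pandas.read_csv followed by a groupby() on the year column and a sum over the frequency column would cover similar ground, again assuming those column names.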