Data Analysis

Introduction

Data analysis is essential both for acquiring new knowledge and for supporting decisions. When a component of a broader system is studied, analyzing the collected data in light of prior knowledge of the subject yields new insights, which in turn inform decisions on how to improve the system. In public transport, data analysis plays a central role in system evaluation, as part of the continuous improvement of the services provided. A solid preliminary understanding of the urban transport system and of the services under evaluation helps the researcher formulate an appropriate methodology for defining and calculating evaluation indicators, including the collection of suitable data and their subsequent analysis.

Methods of Data Analysis

Descriptive Statistics

The most common type of analysis performed on collected data is the calculation of simple, standard statistical measures such as means, weighted means, variances, and frequencies. This basic analysis applies to both the quantitative and the qualitative characteristics of an urban transport system. Descriptive statistics are used for this purpose, aiming at a concise yet comprehensive presentation of the data of a study. For quantitative variables they include measures of central tendency (mean, median, mode), measures of dispersion (range, interquartile range, variance, standard deviation), and measures of relative position (percentiles, quartiles). For qualitative variables they include frequencies (absolute, relative, cumulative, and cumulative relative).
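As a minimal sketch of the measures listed above, the following Python snippet uses only the standard library. The waiting times and survey answers are hypothetical values invented for illustration, and the function names `describe` and `frequency_table` are likewise assumptions, not part of any established tool.

```python
import statistics
from collections import Counter

# Hypothetical quantitative data: waiting times at a stop, in minutes.
waiting_times = [4, 7, 5, 10, 7, 3, 12, 7, 6, 9]
# Hypothetical qualitative data: answers to a service-quality question.
ratings = ["good", "fair", "good", "poor", "good",
           "fair", "good", "poor", "good", "fair"]


def describe(values):
    """Central tendency, dispersion, and relative position of a sample."""
    q1, q2, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance
        "std_dev": statistics.stdev(values),      # sample standard deviation
        "quartiles": (q1, q2, q3),
    }


def frequency_table(categories, order):
    """Absolute, relative, and cumulative frequencies per category."""
    counts = Counter(categories)
    total = len(categories)
    rows, cumulative = [], 0
    for category in order:
        cumulative += counts[category]
        rows.append({
            "category": category,
            "absolute": counts[category],
            "relative": counts[category] / total,
            "cumulative": cumulative,
            "cumulative_relative": cumulative / total,
        })
    return rows


summary = describe(waiting_times)
table = frequency_table(ratings, order=["poor", "fair", "good"])
print(summary)
for row in table:
    print(row)
```

The category order is passed explicitly because cumulative frequencies are only meaningful when the categories have a natural ordering.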

Quartile Analysis

Quartile analysis (more often called quadrant analysis, since it divides a diagram into four regions) is a simple method for analyzing qualitative characteristics and is used to evaluate the quality of the transport services provided. Its purpose is to identify the characteristics that need immediate improvement by comparing, for each qualitative characteristic, the average importance rating with the average satisfaction rating. First, the characteristics to be evaluated are rated through a survey, with each characteristic receiving one score for importance and another for satisfaction. Then these scores are plotted on an importance–satisfaction diagram, on which the managing authority sets the thresholds for “important/not important” and “satisfied/not satisfied.” Each characteristic is thereby classified as important or not and as performing well or poorly. The aim is not only to identify important characteristics that perform poorly and therefore require improvement, but also to keep at a high level those that are important and perform well, since they provide a competitive advantage. A drawback of the method is that the thresholds are often set arbitrarily.
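The classification step described above can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the 1 to 5 rating scale, the threshold of 3.5 on both axes, the characteristic names and scores, and the labels given to the four regions of the diagram.

```python
from dataclasses import dataclass


@dataclass
class Rating:
    name: str
    importance: float    # mean importance score from the survey
    satisfaction: float  # mean satisfaction score from the survey


def classify(rating, importance_threshold, satisfaction_threshold):
    """Place one characteristic in a region of the importance-satisfaction diagram."""
    important = rating.importance >= importance_threshold
    satisfied = rating.satisfaction >= satisfaction_threshold
    if important and not satisfied:
        return "improve first"
    if important and satisfied:
        return "maintain"
    if satisfied:
        return "low importance, performing well"
    return "low importance, performing poorly"


# Hypothetical mean scores on a 1-5 scale.
ratings = [
    Rating("punctuality", 4.6, 2.9),
    Rating("cleanliness", 3.9, 4.1),
    Rating("onboard wifi", 2.4, 3.8),
    Rating("seat design", 2.8, 2.7),
]

# Hypothetical thresholds chosen by the managing authority.
classification = {r.name: classify(r, 3.5, 3.5) for r in ratings}
for name, label in classification.items():
    print(f"{name}: {label}")
```

Because the thresholds are inputs rather than constants, the same data can be reclassified under different thresholds to check how sensitive the priorities are to that choice.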

Impact Analysis

Impact analysis measures the relative effect of each service characteristic on overall passenger satisfaction when passengers have recently experienced a problem with it. The aim is to determine which qualitative characteristics have the greatest negative impact on overall satisfaction and affect the largest number of passengers. The process consists of three steps. In the first step, the characteristics with the strongest impact are identified by comparing the average overall satisfaction rating of passengers who experienced a problem with that of passengers who did not. In the second step, the frequency of the problem is recorded, i.e., the percentage of passengers reporting an issue with each characteristic. A characteristic may have a large impact but be reported rarely, or a small impact but be reported by many passengers, which increases its overall significance. In the third step, a composite “impact index” is calculated by multiplying the difference in mean overall satisfaction by the percentage of passengers reporting the problem. Characteristics with the highest impact index are given priority, as they affect passenger satisfaction most strongly.
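The three steps can be sketched as follows. The survey responses are hypothetical, the 1 to 10 satisfaction scale is an assumption, and `impact_index` is an illustrative name rather than a standard function.

```python
from statistics import mean


def impact_index(responses):
    """Compute the impact index for one characteristic.

    responses: list of (had_problem, overall_satisfaction) pairs,
    one pair per surveyed passenger.
    """
    with_problem = [score for had, score in responses if had]
    without_problem = [score for had, score in responses if not had]
    if not with_problem or not without_problem:
        raise ValueError("both groups must contain at least one passenger")
    # Step 1: gap in mean overall satisfaction between the two groups.
    gap = mean(without_problem) - mean(with_problem)
    # Step 2: percentage of passengers who reported the problem.
    share = 100 * len(with_problem) / len(responses)
    # Step 3: composite index.
    return gap * share


# Hypothetical responses: (had a problem?, overall satisfaction on a 1-10 scale).
survey = {
    "crowding": [(True, 4), (True, 5), (True, 6), (False, 8), (False, 9)],
    "delays": [(True, 2), (False, 8), (False, 8), (False, 9), (False, 7)],
}

index = {name: impact_index(rows) for name, rows in survey.items()}
ranked = sorted(index.items(), key=lambda item: item[1], reverse=True)
for name, value in ranked:
    print(f"{name}: {value:.1f}")
```

In this made-up sample, delays produce the larger satisfaction gap, but crowding is reported by more passengers and therefore ends up with the higher index.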

Factor Analysis

Factor analysis is frequently applied to travel surveys to capture the main travel characteristics and the critical factors influencing passenger choices. Its purpose is to reduce the initially defined variables to a smaller number of factors that represent unobserved dimensions. These latent factors correspond to qualitative characteristics that passengers consider important. The method thus attempts to identify the hidden factors shaping passenger perceptions and evaluations, preserving useful information while keeping the model from being overloaded with variables. Often a variance-maximizing (varimax) rotation is applied: the factor axes are rotated so that each variable loads strongly on as few factors as possible, which makes the most significant factors easier to interpret.
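A rough sketch of the idea, assuming NumPy is available: factor loadings are extracted from the correlation matrix by the principal component method (one of several possible extraction methods, chosen here for brevity) and then rotated with a variance-maximizing rotation. The survey answers are synthetic, generated from two made-up latent factors, and the function names are illustrative.

```python
import numpy as np


def principal_loadings(data, n_factors):
    """Loadings from the leading eigenvectors of the correlation matrix."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_factors]
    return eigvecs[:, order] * np.sqrt(eigvals[order])


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal rotation that maximizes the variance of squared loadings."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = (rotated ** 2).sum(axis=0)  # sum of squares per factor
        gradient = loadings.T @ (rotated ** 3 - rotated * column_ss / n_vars)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient
        if s.sum() < criterion * (1 + tol):
            break
        criterion = s.sum()
    return loadings @ rotation


# Synthetic data: 300 respondents, 6 questions driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))
true_loadings = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.0],
                          [0.0, 0.9], [0.0, 0.8], [0.0, 0.7]])
answers = latent @ true_loadings.T + 0.4 * rng.normal(size=(300, 6))

unrotated = principal_loadings(answers, n_factors=2)
rotated = varimax(unrotated)
print(np.round(rotated, 2))
```

The rotation is orthogonal, so it changes how the explained variance is distributed among the factors but not how much of each variable is explained in total.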

Discrete Choice Analysis

Discrete choice analysis aims to build a behavioral model that describes the decisions travelers make when choosing among alternatives, based on their personal characteristics, their needs, and the attributes of the available options. Each alternative is described by specific attributes, which travelers evaluate when making their choice. The model therefore explains the relationship between the traveler’s socioeconomic status, the characteristics of the alternatives, and the demand they generate. These models are based on the principle of utility maximization and require a choice set that meets three conditions: the alternatives are mutually exclusive (choosing one excludes the others), the set is exhaustive (it contains every available option), and the set is finite. Discrete choice models are classified as aggregate or disaggregate. Aggregate models approach the problem macroscopically, analyzing the average behavior of the population, while disaggregate models focus microscopically on the individual traveler, estimating the probability of each choice from the traveler’s characteristics and the utility of the alternatives. Disaggregate models generally provide more accurate insights, as they capture decision-making at the individual level, whereas aggregate models often lose this detail.
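One widely used disaggregate formulation is the multinomial logit model, in which the probability of choosing alternative i is exp(V_i) divided by the sum of exp(V_j) over all alternatives j, where V is the systematic utility. The sketch below assumes that formulation and a utility that is linear in travel time and cost. The coefficients and attribute values are hypothetical and are not estimated from any data.

```python
import math


def utility(attributes, coefficients):
    """Systematic utility as a linear combination of the attributes."""
    return sum(coefficients[name] * value for name, value in attributes.items())


def logit_probabilities(utilities):
    """Multinomial logit choice probabilities for a finite choice set."""
    max_u = max(utilities.values())  # subtracted for numerical stability
    exp_u = {alt: math.exp(u - max_u) for alt, u in utilities.items()}
    total = sum(exp_u.values())
    return {alt: value / total for alt, value in exp_u.items()}


# Hypothetical coefficients: disutility per minute and per euro.
coefficients = {"time_min": -0.05, "cost_eur": -0.4}

# Hypothetical choice set for one traveler.
alternatives = {
    "bus": {"time_min": 40, "cost_eur": 1.5},
    "metro": {"time_min": 25, "cost_eur": 2.0},
    "car": {"time_min": 20, "cost_eur": 5.0},
}

utilities = {alt: utility(attrs, coefficients)
             for alt, attrs in alternatives.items()}
probabilities = logit_probabilities(utilities)
for alt, p in probabilities.items():
    print(f"{alt}: {p:.3f}")
```

The probabilities sum to one over the choice set, which reflects the requirement that the alternatives be mutually exclusive and exhaustive.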

Analysis Tools

For simple statistical analyses, such as the calculation of statistical measures, quartile analysis, and impact analysis, spreadsheet software is sufficient. Programs such as Microsoft Excel, Apple Numbers, OpenOffice, and Google Sheets are widely used, offering collaborative features and effective data presentation. For more specialized analyses, such as factor analysis or logit models, statistical software packages are necessary. SPSS is among the most widely used, offering reliability and extensive data analysis capabilities. R is a programming language and environment for statistical computing that supports a wide range of statistical methods. Gnumeric provides a range of descriptive and inferential techniques, while SSP includes all basic statistical analysis methods.

Ways of Presenting Results

The results of data analysis can be presented graphically in various ways. Pie charts show percentage frequencies, bar charts show the frequencies of categories, histograms show quantitative variables grouped into classes, and boxplots depict the main characteristics of a distribution and help identify outliers. Many other graphical formats exist beyond these, supported by both spreadsheets and statistical packages; the choice depends on the type of analysis performed.
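To illustrate what a boxplot actually displays, the sketch below computes the values behind one. It assumes the common convention that points lying more than 1.5 interquartile ranges beyond the quartiles are drawn as outliers; plotting tools differ in how they compute quartiles, so their results may deviate slightly. The trip durations are hypothetical.

```python
import statistics


def boxplot_stats(values, fence=1.5):
    """Quartiles, whisker ends, and outliers as drawn in a typical boxplot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - fence * iqr, q3 + fence * iqr
    inside = [v for v in values if lower <= v <= upper]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),    # smallest value within the fences
        "whisker_high": max(inside),   # largest value within the fences
        "outliers": sorted(v for v in values if v < lower or v > upper),
    }


# Hypothetical trip durations in minutes.
trip_minutes = [18, 20, 21, 22, 22, 23, 24, 25, 26, 55]
stats = boxplot_stats(trip_minutes)
print(stats)
```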

Conclusion

Publishing the results is essential. They should be made available to the public, either through the official websites of public transport organizations or through printed media. Periodic press releases with the main findings of surveys should also be issued, along with announcements and scientific papers presented at conferences, workshops, and other events.
