Read Data from CSV File

Introduction to Read Data From CSV File

In the R language, one can easily read data from a CSV file with the read.csv() function. There are several ways to point R to the file, and read.csv() has many useful arguments.

It is important to note that a CSV file is a comma-separated values file. Usually, CSV files are generated from spreadsheet-like software such as MS Excel. CSV files are plain text files, much like txt files, but they can also be opened directly in MS Excel. The read.csv() function imports the CSV file as a data frame, a fundamental data structure in R.

read.csv Function in R

Using the read.csv function in R one can read the data from a CSV file by choosing the file (a dialog box opens to select the appropriate file). This is the easy way to choose a data file as the user does not need to type the file path. For example,

data <- read.csv(file.choose(), header = TRUE)

The file.choose() function opens a dialog box for selecting the required file; its result is passed to read.csv() as the file argument.

After selecting the data file, one can inspect the data, for example by displaying its first rows and its structure:

head(data)
str(data)

There is another way to read the data by giving the complete path to the file with the data file name and its extension. The read.csv function in R can be used with important arguments, such as file path and header=TRUE.

data <- read.csv("C:\\book1.csv", header=TRUE)
data <- read.csv("C:\\mywork\\data\\book1.csv", header=TRUE)

After reading the data file, one can check the names of the variables by using the names() function.

names(data)
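
The path-based approach can be tried without any existing file. The sketch below is only an illustration: it writes a small CSV file to a temporary location and reads it back. The file contents and the column names X1 and X2 are made up for the example, and tempfile() stands in for a real path such as the ones above.

```r
# Write a tiny CSV to a temporary file (hypothetical data, not from the article)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("X1,X2", "0.5,a", "0.9,b", "0.8,c"), csv_path)

# Read it back; the first line supplies the column names
data <- read.csv(csv_path, header = TRUE)

head(data)   # first rows of the data frame
str(data)    # column names and types
names(data)  # "X1" "X2"
```

Forward slashes also work in Windows paths, so "C:/mywork/data/book1.csv" is an alternative to doubling the backslashes.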

Selecting Variables from Data Object

One can select a column (variable) by using square brackets and column index or by use of a dollar sign. For example

data$X1    # Selects the variable X1
data[, 1]  # selects the variable in column 1
data[, 4]  # selects the variable in column 4
data[, 1:3] # selects column 1, 2 and 3 

Similarly, one can also select the rows from a data file. For example

data[12, ]   # selects the 12th observation (row) with all variables (columns)
data[5:10, ] # selects rows 5 to 10 with all columns/variables

One can also subset the data by using a conditional (comparison) operator. For example, the following command selects the rows of data in which the variable $X_1$ is greater than 0.7.

data[data$X1 > 0.7, ]
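
A minimal sketch of conditional subsetting follows. The data frame is hypothetical and simply stands in for one read from a CSV file; only the column name X1 comes from the text above.

```r
# Hypothetical data frame standing in for one read from a CSV file
data <- data.frame(X1 = c(0.2, 0.9, 0.75, 0.4),
                   X2 = c(10, 20, 30, 40))

high    <- data[data$X1 > 0.7, ]       # rows where X1 exceeds 0.7, all columns
high_x2 <- data[data$X1 > 0.7, "X2"]   # the same rows, only column X2

high
high_x2
```

The condition data$X1 > 0.7 produces a logical vector with one entry per row, and the square brackets keep the rows where that entry is TRUE.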

Read a CSV File as a Table

One can also read a CSV file as a table. For example,

data <- read.table("C:\\data.csv", sep = ",", header = TRUE)

Some important arguments related to read.csv() function:

  • file: The file argument is used to specify the path to the CSV file. One can provide either the absolute path (e.g., “C:/Users/yourname/Documents/data.csv”) or the relative path if the file is in the working directory.
  • header (optional): The header argument is logical (either TRUE or FALSE) and indicates whether the first row of the CSV file contains the names of the columns. By default, header=TRUE. If the file does not have a header row, set header=FALSE.
  • sep (optional): The sep argument specifies the delimiter (separator) used between values in the CSV file. The default is a comma (“,”).
  • dec (optional): The dec argument defines the decimal point character used in the CSV file. The default is “.”.
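
As an illustration of the sep and dec arguments, the sketch below reads a file that uses a semicolon between values and a comma as the decimal mark. The file contents and the names id and weight are assumptions made for the example.

```r
# Hypothetical semicolon-separated data with comma decimals
path <- tempfile(fileext = ".csv")
writeLines(c("id;weight", "1;2,5", "2;3,75"), path)

# Tell read.csv() which separator and which decimal mark the file uses
d <- read.csv(path, header = TRUE, sep = ";", dec = ",")

d$weight  # numeric: 2.50 3.75
```

The read.csv2() function uses sep = ";" and dec = "," as its defaults, so it can read such a file without the extra arguments.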

https://itfeature.com

https://gmstat.com

Import Data in R, Reading, and Creating Data

There are many ways to read data into the R language. Here we learn how to import data into R; certain kinds of patterned data can also be generated. Some of these ways are:

Reading Data from the Keyboard Directly

For small data (few observations) one can input data in vector form directly on R Console, such as

x <- c(1, 2, 3, 4, 5)
y <- c('a', 'b', 'c')

In vector form, data can be entered on several lines: as long as the right parenthesis is omitted, R keeps waiting for more input until the data are complete, such as

x <- c(1, 2,
       3, 4)

Note that it is more convenient to use the scan function, which shows the index of the next entry as a prompt.

Using Scan Function

For small data sets it is better to read data from the console by using the scan function. The data can be entered on separate lines, with values separated by a single space and/or tab. After entering all the required data, pressing the Enter key twice terminates the scanning.

X <- scan()
1:   3 4 5
4:   4 5 6 7
8:   2 3 4 5 6 6
14:
Read 13 items

Reading String Data using the “what” Option

y <- scan(what=" ")
1:    red green blue
4:    white
5:
Read 4 items

The scan function can also be used to import data. The scan function returns a list or a vector, while the read.table function returns a data frame. This means that the scan function is less useful for inputting “rectangular” data.
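
The sessions above are interactive. For a script, the same behaviour can be sketched non-interactively with the text argument of scan(), which supplies the input as a string instead of the keyboard. The values are the ones typed in the examples above.

```r
# Numeric input: whitespace-separated values become a numeric vector
x <- scan(text = "3 4 5 4 5 6 7", quiet = TRUE)

# Character input: what = "" tells scan() to read strings
y <- scan(text = "red green blue white", what = "", quiet = TRUE)

x
y
```

Both results are plain vectors, which illustrates why scan() suits simple sequences of values better than tables with several columns.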

Reading data from ASCII or plain text files into R as Data Frame

The read.table function reads any type of delimited ASCII file, which may contain both numeric and character values. It is the easiest and most reliable way of reading data into R. The default delimiter is white space (one or more blanks or tabs).

data <- read.table(file=file.choose()) #select from dialog box

data <- read.table("http://itfeature.com/test.txt", header=TRUE) # read from web site

Note that the read.table command can also be used for reading data from the computer disk by providing an appropriate path in quotation marks, such as

data <- read.table("D:/data.txt", header=TRUE) # read from your computer

If some values are missing (left blank) in a space-delimited file, read.table cannot tell which column each remaining value belongs to, and it stops with an error. The easiest way to fix this error is to store the data with an explicit delimiter, such as a comma, and to name that delimiter with the sep argument.

data <- read.table("http://itfeature.com/missing_comma.txt", header=TRUE, sep=",")
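
The effect of a blank value can be sketched with inline text instead of a file. The two strings below are made-up data holding the same table, once space-delimited and once comma-delimited; tryCatch() is used only so that the example keeps running after the error.

```r
# Space-delimited text where the second data row lacks its middle value
spaced <- "a b c\n1 2 3\n4  6"
result <- tryCatch(read.table(text = spaced, header = TRUE),
                   error = function(e) "error")

# The same data, comma-delimited: the empty field is kept and read as NA
commas <- "a,b,c\n1,2,3\n4,,6"
d <- read.table(text = commas, header = TRUE, sep = ",")

result  # "error": the short row cannot be aligned with three columns
d$b     # 2 NA
```

With white space as the delimiter, two consecutive blanks count as a single separator, so the missing value disappears and the row appears to have only two fields.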

Comma-delimited files can be read with the read.table function and the sep argument, but they can also be read with the read.csv function, which is written specifically for comma-delimited files. To display the contents of the data, use the print() function or type the name of the data object.

data <- read.csv(file=file.choose() )

Reading in fixed formatted files

To read data in fixed format, use the read.fwf function; its widths argument gives the width (number of characters) of each variable. In this format the variable names are not in the first line, so they must be added after reading the data. Here the variable names are added with the dimnames function, and the bracket notation indicates that the names are attached to the variables (columns) of the data. There are several other ways to do this task.

data <- read.fwf("http://itfeature.com/test_fixed.txt", widths = c(8,1,3,1,1,1))

dimnames(data)[[2]] <- c("v1", "v2", "v3", "v4", "v5", "v6")
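
A self-contained sketch of fixed-width reading is given below. The file layout (two characters for an id followed by three for a score) and the column names are assumptions for illustration. Here the names are supplied through the col.names argument of read.fwf(), which is one of the other ways of naming the columns.

```r
# Hypothetical fixed-width file: 2 characters for id, 3 for score
fwf_path <- tempfile(fileext = ".txt")
writeLines(c("01123", "02456", "03789"), fwf_path)

# widths gives the number of characters per field; col.names sets the names
d <- read.fwf(fwf_path, widths = c(2, 3), col.names = c("id", "score"))

d
```

Each line is cut at the given character positions, so no delimiter is needed between the fields.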

Import Data In R

Importing data in R is fairly simple. For Stata and Systat files, use the foreign package. For SPSS and SAS files, the Hmisc package is recommended for its ease of use and functionality. See the Quick-R section on packages for information on obtaining and installing these packages. Examples of importing data in R are provided below.

From Excel

On Windows systems, you can use the RODBC package to access Excel files. The first row of the Excel file should contain variable/column names.

# Excel file name is myexcel and worksheet name is mysheet
library(RODBC)
channel <- odbcConnectExcel("c:/myexcel.xls")
mydata <- sqlFetch(channel, "mysheet") 
odbcClose(channel)

From SPSS

# In SPSS: first save the SPSS dataset in transport format
get file = 'c:\data.sav'
export outfile = 'c:\data.por'
# In R
library(Hmisc)
mydata <- spss.get("c:/data.por", use.value.labels=TRUE)
# The "use.value.labels" option converts value labels to R factors.

From SAS

# In SAS: save the SAS dataset in transport format
libname out xport 'c:/mydata.xpt';
data out.data;
set sasuser.data;
run;
# In R
library(Hmisc)
mydata <- sasxport.get("c:/mydata.xpt")
# character variables are converted to R factors

From Stata

# input Stata file
library(foreign)
mydata <- read.dta("c:/data.dta")

From Systat

# input Systat file
library(foreign)
mydata <- read.systat("c:/mydata.dta")

Accessing Data in R Library

Many R packages, including the car package, contain data sets. For example, to access the Duncan data frame from the car package, type the following commands in the R console:

library(car)
data(Duncan)
attach(Duncan)

Some Important Commands for Dataframes

data         # displays the entire data set in the console
head(data)   # displays the first 6 rows of the data frame
tail(data)   # displays the last 6 rows of the data frame
str(data)    # displays the variable names and their types
names(data)  # shows the variable names only
rename(V1, Variable1, dataFrame=data) # renames V1 to Variable1; note that the epicalc package must be installed
ls()         # shows a list of objects that are available
attach(data) # attaches the data frame to the R search path, which makes it easy to access variables by name
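
Renaming a variable does not require an extra package. The sketch below shows one base R alternative on a hypothetical data frame; the names V1, V2, and Variable1 are placeholders. It also shows with(), which reaches the variables of a data frame without attaching it.

```r
# Hypothetical data frame with a default-style column name
data <- data.frame(V1 = c(1.5, 2.5), V2 = c("a", "b"))

# Rename V1 to Variable1 using base R only
names(data)[names(data) == "V1"] <- "Variable1"

names(data)                   # "Variable1" "V2"

# with() evaluates an expression inside the data frame without attach()
with(data, mean(Variable1))   # 2
```

Unlike attach(), with() leaves the search path unchanged, so variables from different data frames cannot mask one another.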

Import Data Files Into R

You can read/import data files into R that are produced by different software such as MS Excel, Minitab, SPSS, SAS, and STATA.

Import Data Files into R Language

In the R language, different data file formats can be read or imported easily. Let us learn how to import data files into R from formats such as Excel, SPSS, STATA, SAS, and Minitab.

Reading Excel Files in R

The XLConnect package can be used to import data files into R produced in MS Excel. First, you have to install this package.

install.packages("XLConnect")

You can check if the package is already installed using the code:

any(grepl("XLConnect", installed.packages()))
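
The grepl() check above looks for the text anywhere in the table of installed packages, so it may also match a similarly named package. A stricter check can be sketched as follows; the helper name is_installed is hypothetical, and the stats package is used in the demonstration only because it ships with R.

```r
# TRUE if a package with exactly this name is installed
is_installed <- function(pkg) {
  pkg %in% rownames(installed.packages())
}

is_installed("stats")   # stats is part of every R installation
```

Calling is_installed("XLConnect") would then answer the question for the package discussed here.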

If the package is installed, you need to load it into your session (using the library( ) function) before reading data from MS Excel files.

Suppose, you have a data file (Hald.xlsx) stored at the path "D:\STAT\STA-654\Hald.xlsx". To read this file the readWorksheetFromFile( ) function can be used. For example,

library(XLConnect)
data <- readWorksheetFromFile("D:/stat/sta-654/Hald.xlsx", sheet=1)

Since an Excel workbook can contain more than one sheet, you need to use the sheet argument to specify which sheet you want to load into R. In this example, data from sheet 1 will be loaded.

If you want to load the whole workbook (all sheets), use the loadWorkbook( ) function and then read the required worksheet as a data frame with readWorksheet( ). For example,

wb <- loadWorkbook("D:/stat/sta-654/Hald.xlsx") 
df <- readWorksheet(wb, sheet = 1)

The readxl package can also be used to read MS Excel files more easily.

library(readxl) 
df <- read_excel("Data file with path")

Note that MS Excel files with the extension *.xls or *.xlsx can be specified. The sheet argument can also be added, just like with the XLConnect package.

Read SPSS Data Files

To read SPSS files, install the foreign package. After loading the foreign package, the read.spss( ) function can be used to load an SPSS data file in R. For example,

library(foreign)
mydata <- read.spss("SPSS data file with path", to.data.frame = TRUE)

The argument to.data.frame is set to TRUE so that the data is returned as a data frame. SPSS data files usually contain value labels; if you do not want the variables with value labels to be converted into R factors with corresponding levels, set use.value.labels = FALSE. For example,

library(foreign) 

mydata <- read.spss("SPSS data file with path", 
                    to.data.frame = TRUE, 
                    use.value.labels = FALSE)

Reading STATA Data Files

To import Stata files the read.dta( ) function from the foreign package can be used. For example,

library(foreign)
mydata <- read.dta("STATA file with path")

Reading SAS Data Files

The read.sas7bdat( ) function from the sas7bdat package can be used to read SAS data files into R.

library(sas7bdat)
mydata <- read.sas7bdat("SAS data file with path")

Reading Minitab Data File

Minitab data files can be imported in R using the read.mtp( ) function from the foreign package.

library(foreign)
mydata <- read.mtp("Minitab data file with path")

Reading RDA or RData Files

R data files (RDA or RData files) can be easily read using the load( ) function.

load("filename.RDA")
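
A round trip with save( ) and load( ) can be sketched as follows. The object name scores and its contents are made up, and a temporary file stands in for "filename.RDA".

```r
# Hypothetical object to store
scores <- data.frame(id = 1:3, score = c(90, 85, 70))

rdata_path <- tempfile(fileext = ".RData")
save(scores, file = rdata_path)   # write the object under its own name

rm(scores)                        # remove it from the workspace
loaded <- load(rdata_path)        # restores 'scores' and returns the object names

loaded            # "scores"
exists("scores")  # TRUE
```

Note that load( ) recreates objects under the names they had when saved, so it can overwrite existing objects with the same names.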

Reading Text Files In R Language

We can import data that is already saved in a text (*.txt) file or in a file created by MS Excel, SPSS, or some other software. Before importing data stored in a file (that is, before reading text files in R), one should understand the following:

  1. Data from spreadsheets usually reserve the first row for the header (names of variables), while the first column identifies the sampling unit (observation number).
  2. Avoid blank spaces in names and field values; otherwise each word may be interpreted as a separate variable, resulting in errors.
  3. To concatenate words, use a full stop (.) instead of a space between words.
  4. Name variables with short or abbreviated names.
  5. Try to avoid variable names that contain symbols such as ?, $, %, ^, *, (, ), -, #, <, >, /, |, \, [, ], {, and }.
  6. Delete any comments you have made in your Excel file.
  7. Make sure missing values in your dataset are indicated with NA.
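
The naming and missing-value points above can be illustrated with a small sketch. The header and values below are made up; they deliberately contain a space, a symbol, a blank field, and an NA entry so that the default behaviour of read.csv( ) is visible.

```r
# Hypothetical header with a space and a symbol, plus a blank field and "NA"
raw <- "plant height,yield%\n10,NA\n,3.5"
d <- read.csv(text = raw, header = TRUE)

names(d)            # "plant.height" "yield."
colSums(is.na(d))   # one missing value in each column

# make.names() applies the same conversion to any character vector
make.names(c("plant height", "yield%"))
```

By default R replaces characters that are not allowed in names with a full stop, which is why clean names in the source file are easier to work with.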

Preparing R workspace

Before importing data in R, it is better to delete all objects using the following line of code

rm(list = ls() )

The rm( ) function removes objects from a specified environment. Since ls( ) is called without arguments, it lists every object in the current workspace, so all datasets and user-defined functions will be deleted.

Confirm your working directory before importing a file to R, using

getwd()

If needed, change the working directory, for example

setwd("D:\\Stat\\STA-654")

Note that the directory (folder) must already exist, so you may have to create it before running the command above.
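
The working-directory steps can be tried safely with a temporary folder, as in the sketch below. The folder name reuses STA-654 from the path above, the file demo.csv is hypothetical, and tempdir( ) replaces the D: drive so that the example runs on any system.

```r
old_dir <- getwd()                            # remember the current directory

work_dir <- file.path(tempdir(), "STA-654")   # hypothetical folder location
dir.create(work_dir, showWarnings = FALSE)    # create it if it does not exist
setwd(work_dir)

writeLines("x\n1\n2", "demo.csv")   # a relative path resolves inside work_dir
file.exists("demo.csv")             # TRUE

setwd(old_dir)                      # restore the original directory
```

Once the working directory is set, files in it can be read with their name alone, without the full path.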

Reading Text Files in R

Reading Text files in R is easy and simple enough. If you have data in a *.txt file or a tab-delimited text file, you can easily import it with the read.table( ) function. Suppose we have a data file named "Hald.txt" stored at the path "D:\STAT\STA-654\Hald.txt". The following code line can be used for reading text files in R:

datafile <- read.table ("D:/stat/sta-654/Hald.txt", header = TRUE)

If you have data stored on some web address, you can also import it as

datafile <- read.table ("http://itfeature.com/wp-content/uploads/2020/03/Hald.txt", header = TRUE)

Note that the first argument of read.table() provides the name (with path and extension) of the file that you want to import into R. The header argument specifies whether the first line of the data file contains column names. The Hald.txt file will be imported as a data.frame object.
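
The role of the header argument can be seen in a small sketch that reads the same text twice. The content below is made up and is passed through the text argument, so no file is needed.

```r
txt <- "X1 X2\n7 26\n1 29"   # hypothetical whitespace-delimited content

with_header <- read.table(text = txt, header = TRUE)
no_header   <- read.table(text = txt, header = FALSE)

names(with_header)   # "X1" "X2"
names(no_header)     # "V1" "V2": the header line is now a data row
class(with_header)   # "data.frame"
```

With header = FALSE the first line is treated as data, so the result has one extra row and default column names.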
