Getting Started

2017-08-21

A (Very) Quick Start

To begin, make sure civis is installed and your API key is in your R environment. You can quickly test that civis is working by invoking

name <- civis::users_list_me()$name
paste(name, "is really awesome!")

If civis is working, you’ll see a friendly message. Otherwise, you might see an error like this if civis wasn’t installed properly:

Error in loadNamespace(name) : there is no package called 'civis'

or like this if you haven’t set your API key correctly:

Error in api_key() : The environmental variable CIVIS_API_KEY is not set. Add this to your .Renviron or call Sys.setenv(CIVIS_API_KEY = '<api_key>')
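Both failure modes can also be checked up front, before any API call is made. The following is a minimal sketch using only base R; the variable names and messages are illustrative and not part of civis. It confirms only that a key is present, not that the key is valid, so the users_list_me() call above remains the real test.

```r
# Pre-flight check using base R only; no request is sent to the API.
# Variable names and messages here are illustrative, not part of civis.
civis_installed <- requireNamespace("civis", quietly = TRUE)
api_key_set <- nzchar(Sys.getenv("CIVIS_API_KEY"))

if (!civis_installed) {
  message("civis is not installed; try install.packages(\"civis\")")
} else if (!api_key_set) {
  message("CIVIS_API_KEY is not set; add it to your .Renviron")
} else {
  message("civis is installed and an API key is present")
}
```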

With civis, moving data to and from the cloud takes only a few lines of code. Your data can be stored as rows in a table, as CSVs on a remote file system, or even as serialized R objects like nested lists. For example,

library(civis)

# First we'll load a dataframe of the famous iris dataset
data(iris)

# We'll set a default database and define the table where we want to
# store our data
options(civis.default_db = "my_database")
iris_tablename <- "my_schema.my_table"

# Next we'll push the data to the database table
write_civis(iris, iris_tablename)

# Great, now let's read it back
df <- read_civis(iris_tablename)

# Hmmm, I'm more partial to setosa myself. Let's write a custom SQL query.
# We'll need to wrap our query string in `sql` to let read_civis know we
# are passing in a sql command rather than a tablename.
query_str <- paste("SELECT * FROM", iris_tablename, "WHERE Species = 'setosa'")
iris_setosa <- read_civis(sql(query_str))

# Now let's store this data along with a note as a serialized R object
# on a remote file system. We could store any object remotely this way.
data <- list(data = iris_setosa, special_note = "The best iris species")
file_id <- write_civis_file(data)

# Finally, let's read back our data from the remote file system.
data2 <- read_civis(file_id)
data2[["special_note"]]

## [1] "The best iris species"
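To see what the file round trip does to an object without touching the remote file system, here is a local analogue using base R. It assumes that the remote path serializes objects in R's RDS format, which is worth confirming in the write_civis_file documentation; the names payload, tmp, and restored are illustrative only.

```r
# Local analogue of the remote round trip (an assumption-based sketch):
# serialize a nested list to a temporary file, then read it back.
data(iris)
iris_setosa <- iris[iris$Species == "setosa", ]
payload <- list(data = iris_setosa, special_note = "The best iris species")

tmp <- tempfile(fileext = ".rds")
saveRDS(payload, tmp)      # write the serialized object to disk
restored <- readRDS(tmp)   # read it back into a new object
unlink(tmp)                # clean up the temporary file

# The nested structure, including the data frame, survives the round trip.
all.equal(payload, restored)
nrow(restored$data)
restored[["special_note"]]
```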

civis also includes functionality for working with CivisML, Civis’ machine learning ecosystem. With CivisML and civis together, you can build models in the cloud, where they can use as much memory as they need without straining your laptop.

library(civis)

# It really is a great dataset
data(iris)

# Gradient boosting or random forest, who will win?
gb_model <- civis_ml_gradient_boosting_classifier(iris, "Species")
rf_model <- civis_ml_random_forest_classifier(iris, "Species")
macroavgs <- list(gb_model = gb_model$metrics$metrics$roc_auc_macroavg,
                  rf_model = rf_model$metrics$metrics$roc_auc_macroavg)
macroavgs

## $gb_model
## [1] 0.9945333
## 
## $rf_model
## [1] 0.9954667
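Once the scores are in a named list, picking the winner takes a couple of lines of base R. The sketch below hard-codes the two values printed above so that it runs without an API call; in a live session you would use the macroavgs list built from the fitted models instead. The names scores and best are illustrative.

```r
# AUC values copied from the printed output above; in a live session,
# reuse the macroavgs list built from the fitted models.
macroavgs <- list(gb_model = 0.9945333, rf_model = 0.9954667)

scores <- unlist(macroavgs)         # named numeric vector
best <- names(which.max(scores))    # name of the highest-scoring model

best
sort(scores, decreasing = TRUE)     # all models, best first
```

For these values the random forest comes out ahead, though the gap between the two models is below 0.001.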

For a comprehensive list of functions in civis, see Reference in the full documentation. The full documentation also includes a set of Articles for detailed documentation on common workflows, including data manipulation and building models in parallel.