--- title: "How to extend connector" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{How to extend connector} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) # Setting up a tmp folder to use as the working directory for # the vignette. Also create a folder used in examples below. tmp <- withr::local_tempdir() dir.create(path = file.path(tmp, "my_root_path", "my_project"), recursive = TRUE) library(connector) ``` `connector` is build to be easily extensible. This vignette will show you how to create your own connector classes, and how to customize the existing ones, and their methods. The basic `Connector` class is a simple `{R6}` object, with no additional functionality. It serves as the foundation for all other connectors, that should inherit from it in their `R6::R6Class()` definition - either directly or indirectly. Both `ConnectorFS` and `ConnectorDBI` inherit from `Connector` in this way. ## Creating a new connector We can create a new connector by creating a new R6 class that inherits from `Connector`: ```{r} connector_myclass <- R6::R6Class( "connector_myclass", inherit = Connector ) connector_myclass$new() ``` This is the simplest type of inheritance, and you should note that the `Connector` parent class has no methods capable of e.g. reading (`read_cnt()`) or writing (`write_cnt()`) data. It only has default methods that throws meaningful errors if you have not defined the method (e.g. `write_cnt.my_class()`) for your new connector class. In most cases you want to inherit from either `ConnectorFS` or `ConnectorDBI` depending on if your new connector is used to access files or databases respectively. Below we create a new `connector_project` class that inherits and acts exactly as the `ConnectorFS`, but instead of the user having to provide the `path` to the folder as an argument, they provide the `project` name, and the `path` is constructed from that. ```{r} connector_project <- R6::R6Class( "connector_project", inherit = ConnectorFS, public = list( initialize = function(project) { private$.project <- project path <- file.path(tmp, "my_root_path", project) super$initialize(path) } ), private = list( .project = NULL ), active = list( project = function() { private$.project } ) ) ``` This way of extending connector could e.g. be relevant inside an organisation where all projects are stored in a common folder structure. When we not initialize a `connector_project` you can see that it still has all the methods from `ConnectorFS`, and that the path has been assigned correctly based on the `project` argument: ```{r} my_project <- connector_project$new(project = "my_project") print(my_project) ``` We can now use this `connector` to read and write data, just as we would with `ConnectorFS`: ```{r} # First list current content: my_project |> list_content_cnt() # Write some content: my_project |> write_cnt("Hello world!", "my_file.txt") # List content again: my_project |> list_content_cnt() # Read the content: my_project |> read_cnt("my_file.txt") ``` ## Create custom generic method All `connector`generics such as `list_content_cnt()`, `read_cnt()`, and `write_cnt()` are S3 generics. This means that you can create custom methods for your new connector class, and they will be used when the generic is called, instead of the one associated with the parent class. To illustrate this we can take a look at the `list_content_cnt()` generic: ```{r} # Print the generic print(list_content_cnt) # List the registered s3 methods methods("list_content_cnt") |> cat(sep = "\n") ``` Building further on the example below we can define a custom method for `list_content_cnt()` for the `connector_project` class, to be used instead of `list_content_cnt.ConnectorFS`: ```{r} list_content_cnt.connector_project <- function(connector_object, ...) { cli::cli_alert("Listing content of {connector_object$project}") NextMethod() } ``` This is of course a very simple example, that just prints a message before calling the `ConnectorFS` method. We can now see that this method is available and that is it associated with the `connector_project` class: ```{r} # List methods again methods("list_content_cnt") |> cat(sep = "\n") # Print my_project connector to see associated methods print(my_project) ``` And when we use `list_content_cnt()` on our `my_project` object, we see that the custom method is used and we get the message: ```{r} my_project |> list_content_cnt() ``` ## Use extra class for simple customization If you as above just want to slightly tweak the behavior of an existing functionality an alternative solution is to use the `extra_class` argument when initializing of the connector. This argument adds the `extra_class` as the first class of the creating the connector, meaning that for any generic dispatch, such as of `list_content_cnt()`, a method for this class will be used before any of the connector classes. To redo the two examples above we make a new `ConnectorFS` with the extra class `demo_extra`: ```{r} my_project_extra <- ConnectorFS$new( path = file.path(tmp, "my_root_path", "my_project"), extra_class = "my_extra_class" ) print(my_project_extra) ``` As you can see here we have all the methods from `ConnectorFS`, but the `demo_extra` is now the first class in the class hierarchy. To create a custom method for `list_content_cnt()` for the `demo_extra` we do the same as for the `connector_project` above: ```{r} list_content_cnt.my_extra_class <- function(connector_object, ...) { cli::cli_alert("Listing content of {connector_object$path}") NextMethod() } ``` The project information is of course not available now, so we just print the path instead, but otherwise everything is the same: ```{r} # List methods methods("list_content_cnt") # Print my_project_extra connector to see associated methods print(my_project_extra) # List content to see the new message my_project_extra |> list_content_cnt() ``` ## Special handling of files A special property of file storage connectors (inheriting from `ConnectorFS`) is that they are operating on files not on databases. This means that they can handle multiple file formats, and also not only file formats for reading and writing rectangular data. When handling files the user will only use the `read_cnt()` and `write_cnt()` generics, but behind the scenes the following chain of functions are called: * `read_cnt()` --> `read_file()` --> `read_ext()` --> External read function * `write_cnt()` --> `write_file()` --> `write_ext()` --> External write function Here `read_cnt()` dispatches based on the class of the `Connector` object. For the file storage connectors (inheriting from `ConnectorFS`) the `read_file()` method is then called on the path of the file. `read_file()` here is only a helper function, that then calls `read_ext()` which is a generic that dispatches based on file extension of the file, and uses general functions from other packages to read the file. As an example any file with the extension `.csv` will be read using `readr::read_csv()`. The same logic applies to `write_cnt()`. ### Add new file format The currently supported file types can be seen in reference of the `read_cnt()` and `write_cnt()` functions respectively. But let us imagine we want to support a new imaginary file format `myformat`. In order to this all we need to do is to create the appropriate `read_ext()` and `write_ext()` methods: ```{r} read_ext.myformat <- function(path, ...) { cli::cli_alert("Reading myformat file") readLines(con = path) } write_ext.myformat <- function(file, x, ...) { cli::cli_alert("Writing myformat file") writeLines(text = x, con = file) } ``` And we can now use them to write and read our new file format with out existing `my_project` connector: ```{r} # List already existing content: my_project |> list_content_cnt() # Write some content in myformat: my_project |> write_cnt("Hello new format!", "new_file.myformat") # List content again: my_project |> list_content_cnt() # Read the content: my_project |> read_cnt("new_file.myformat") ```