easyScieloPak

easyScieloPak is an R package that allows you to search and access academic articles from SciELO programmatically.

Objective

The main goal of easyScieloPak is to simplify the process of querying SciELO from R by:

- Making queries readable and reproducible.
- Allowing filters such as year, collection (country), language, journal, and subject category.
- Handling pagination, data parsing, and cleaning automatically.
- Providing clear, validated feedback when a query is incorrect.
- Minimizing errors caused by anti-scraping measures (e.g., HTTP 403 errors).

Features

Installation

You can install the development version of easyScieloPak from GitHub using either devtools or remotes:

Using devtools

install.packages("devtools")
devtools::install_github("https://github.com/PabloIxcamparij/easyScieloPack.git")

Or using remotes

install.packages("remotes")
remotes::install_github("https://github.com/PabloIxcamparij/easyScieloPack.git")

library(easyScieloPak)

Create a query

library(easyScieloPak)

df <- search_scielo("salud ambiental", collections = "Ecuador", languages = "es", n_max = 5)
head(df)

df <- search_scielo("ecology", collections = "Chile", languages = "en", n_max = 8)

View(df) # View results in RStudio

Current Limitations

- Default fallback limit: If the total number of available results cannot be determined, the query falls back to fetching at most 100 articles.
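The fallback rule can be pictured with a small sketch. The helper below is hypothetical and not part of easyScieloPak's documented API; the name resolve_fetch_limit, its arguments, and the assumption that n_max can only lower the limit are illustrative choices, not confirmed package behavior.

```r
# Hypothetical helper illustrating the fallback rule; not the package's code.
resolve_fetch_limit <- function(total_available, n_max = NULL, fallback = 100L) {
  # Treat NULL, NA, or non-positive totals as "unknown".
  total_known <- !is.null(total_available) &&
    length(total_available) == 1L &&
    !is.na(total_available) &&
    total_available > 0

  # Use the reported total when it is known, otherwise the fallback.
  limit <- if (total_known) as.integer(total_available) else as.integer(fallback)

  # Assumption: an explicit n_max from the caller can only lower the limit.
  if (!is.null(n_max)) {
    limit <- min(limit, as.integer(n_max))
  }
  limit
}

resolve_fetch_limit(NA)              # total unknown: uses the fallback
resolve_fetch_limit(250, n_max = 8)  # total known: capped by n_max
```

Under these assumptions, passing n_max explicitly is the simplest way to keep the number of fetched articles predictable when the total cannot be read.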

Recent Improvements

- Rotating User-Agents: Each request uses a different User-Agent string (Chrome, Firefox, Safari variants) to appear more like a real browser and avoid blocking.

- Random delays: Pauses of random length between requests reduce server load and minimize scraping detection.

- Retry logic: If a request fails, the package retries automatically with a different User-Agent.
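The three measures above can be combined in a few lines of base R. This is a minimal sketch under stated assumptions, not the package's internal implementation: fetch_with_retry, its arguments, the User-Agent strings, and the delay range are all hypothetical. The actual HTTP call is passed in as fetch_fun, so the sketch runs without network access; for simplicity it only pauses before a retry, whereas the package also delays between ordinary requests.

```r
# Illustrative User-Agent strings (assumed values, not the package's list).
user_agents <- c(
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
  "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.2 Safari/605.1.15"
)

# Hypothetical helper: fetch_fun(url, agent) performs one request and
# signals an error (via stop()) when the request fails.
fetch_with_retry <- function(url, fetch_fun, agents = user_agents,
                             max_tries = 3L, delay_range = c(1, 3)) {
  # Shuffle the agents so consecutive attempts use different ones.
  rotation <- sample(agents)
  last_error <- NULL

  for (attempt in seq_len(max_tries)) {
    agent <- rotation[(attempt - 1L) %% length(rotation) + 1L]

    # Capture a failure as a condition object instead of aborting.
    result <- tryCatch(fetch_fun(url, agent), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result

    # Wait a random interval before the next attempt.
    if (attempt < max_tries) {
      Sys.sleep(runif(1, delay_range[1], delay_range[2]))
    }
  }

  stop("All ", max_tries, " attempts failed: ", conditionMessage(last_error))
}
```

In practice fetch_fun could wrap an HTTP client, for example a function that calls httr::GET() with httr::user_agent(agent) and raises an error on a non-success status.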

Planned Improvements

The current version of easyScieloPak is fully functional for basic academic exploration through SciELO. However, the following enhancements are planned for future versions:

About SciELO

SciELO is a multidisciplinary open-access platform hosting scientific journals from over 15 countries. It plays a vital role in disseminating research output from Latin America and beyond.

This package provides a lightweight, unofficial method to interact with SciELO’s search interface.

Disclaimer

Contributing

Feel free to open issues or submit pull requests to improve functionality, usability, or documentation.