- nre_ire has been removed. It required fully manual updates and had no automated pipeline support; it will be reconsidered for a future release.
- property_records has been removed because the upstream data source is no longer available.
- cbic has been removed. The upstream CBIC portal migrated to a restricted-access platform. The five cement tables will be rebuilt from IBGE open data in a future release.
- itbi_summary and the internal ITBI helpers (get_itbi, get_itbi_bhe) have been removed. They were incomplete (single-municipality coverage) and are deferred to a future version.

bcb_series
- get_dataset("bcb_series") now returns only four columns: date, code_bcb, name_simplified, and value. Full metadata is available via bcb_metadata.
- table argument now accepts a hierarchy level:
"core" (default), "primary",
"secondary", "tertiary", or
"full". The levels are cumulative: "primary"
includes all "core" series plus key macro indicators such
as SELIC, IPCA, and INCC. Previously the argument accepted BCB category
names ("credit", "price", etc.).
- bcb_metadata gains a hierarchy column (integer 1–4) that records the relevance tier assigned to each series.

rppi
- get_dataset("rppi", table = "all") now returns two
additional columns: transaction_type ("sale"
or "rent") and source (the index name, e.g.,
"IGMI-R", "IVG-R", "FipeZap").
Previously the stacked table had no way to distinguish transaction type
from index source.

- Fixed Sheets not found errors after FIPE added a new summary sheet. Sheet selection now uses numeric indices to avoid Latin-1/UTF-8 mismatches in accented sheet names.
- Fixed get_range() failing with a malformed cell range when tidyxl received a sheet name with a mismatched encoding attribute. The function now derives the sheet lookup key directly from the cells returned by tidyxl rather than from the user-supplied string.
- Restored fetch_fgv_local(), which was inadvertently dropped during earlier refactoring but is still called by the targets pipeline to process the manually maintained FGV IBRE CSV export.
- Adopted the download_* / clean_* naming convention: download_* functions return a file path; clean_* functions parse and tidy the file at that path into a tibble.
- Replaced tryCatch with rlang::try_fetch throughout, using parent = cnd to preserve the original error chain.
- Dataset functions now use validate_dataset_params(),
handle_dataset_cache(),
attach_dataset_metadata(), and
validate_dataset() from R/helpers-dataset.R
and R/helpers-download.R instead of re-implementing these
patterns inline.
- Fixed stale target names in .github/workflows/update_data_weekly.yml. The workflow was silently skipping abecip and abrainc targets on every automated run; those entries now use the current granular target names (abecip_sbpe_data, abecip_units_data, abrainc_indicator_data, etc.).
- Added fgv_ibre_file to the weekly and all target groups so the FGV file-change target is included in scheduled runs.
- Updated the bcb_series table reference in vignettes/getting-started.Rmd to use the new hierarchy levels ("core", "primary", "secondary", "tertiary", "full") instead of the removed category names.
- Updated the rppi_bis table listing to include detailed_annual and detailed_halfyearly, which were previously omitted.
- get_secovi.R
- Added URL field to DESCRIPTION
- Added @source tag for dim_city dataset documentation
- Fixed typo in b3_real_estate documentation (“mian” -> “main”)
- Removed skip_on_cran() definition that shadowed testthat

Version 0.6.0 introduces an intelligent cache freshness detection system with relaxed defaults, so that users are not shown unnecessary warnings.
- get_cache_age(): Returns cache age in days for any dataset
- is_cache_stale(): Checks if cache exceeds recommended freshness thresholds
- check_cache_status(): Diagnostic function showing status of all cached datasets

Cache warnings only appear when data is significantly stale (exceeds 2x the update frequency):
- Weekly datasets: warn after 14 days (not 7)
- Monthly datasets: warn after 60 days (not 30)
- Manual datasets: never warn
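The 2x rule above can be sketched in a few lines of base R. This is only an illustration, not the package's implementation: warn_threshold_days() and is_stale_sketch() are hypothetical names, and the base frequencies (7 and 30 days) are inferred from the "(not 7)" and "(not 30)" notes in the list.

```r
# Hypothetical helper: map an update schedule to a warning threshold in days.
# The threshold is twice the update frequency; manual datasets never warn.
warn_threshold_days <- function(schedule) {
  frequency_days <- c(weekly = 7, monthly = 30)
  if (identical(schedule, "manual")) {
    return(NA_real_)
  }
  if (!schedule %in% names(frequency_days)) {
    stop("Unknown update schedule: ", schedule)
  }
  2 * frequency_days[[schedule]]
}

# Hypothetical helper: TRUE only when the cache age exceeds the threshold.
is_stale_sketch <- function(age_days, schedule) {
  threshold <- warn_threshold_days(schedule)
  !is.na(threshold) && age_days > threshold
}

warn_threshold_days("weekly")   # 14
warn_threshold_days("monthly")  # 60
is_stale_sketch(10, "weekly")   # FALSE: older than 7 days but within 14
is_stale_sketch(400, "manual")  # FALSE: manual datasets never warn
```

Returning NA for manual datasets keeps the "never warn" case explicit rather than relying on an arbitrarily large number.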
- max_age parameter in get_dataset(): Force fresh download if cache exceeds specified age

All datasets in inst/extdata/datasets.yaml now include:
- update_schedule: “weekly”, “monthly”, or “manual”
- warn_after_days: Custom threshold for staleness warnings (NULL for manual datasets)
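As a rough sketch of how these registry fields and a user-supplied max_age might interact, the example below models two registry entries as plain R lists. The entry values, the manual_example name, and the cache_decision() helper are assumptions made for illustration; the package's actual logic may differ.

```r
# Registry entries as they might look after parsing datasets.yaml into R lists.
# Field names follow the text; the values here are illustrative assumptions.
registry <- list(
  bcb_series     = list(update_schedule = "weekly", warn_after_days = 14),
  manual_example = list(update_schedule = "manual", warn_after_days = NULL)
)

# Hypothetical decision helper: a user-supplied max_age takes precedence and
# forces a fresh download; otherwise the registry threshold only triggers a
# warning, and a NULL threshold (manual datasets) never does.
cache_decision <- function(age_days, entry, max_age = NULL) {
  if (!is.null(max_age) && age_days > max_age) {
    return("refresh")
  }
  threshold <- entry$warn_after_days
  if (!is.null(threshold) && age_days > threshold) {
    return("use_cache_with_warning")
  }
  "use_cache"
}

cache_decision(3, registry$bcb_series)               # "use_cache"
cache_decision(3, registry$bcb_series, max_age = 1)  # "refresh"
cache_decision(20, registry$bcb_series)              # "use_cache_with_warning"
cache_decision(500, registry$manual_example)         # "use_cache"
```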
# Check status of all cached datasets
check_cache_status()
# Get age of specific dataset
get_cache_age("bcb_series")
# Check if dataset is stale (uses relaxed defaults from registry)
is_cache_stale("bcb_series")
# Advanced: Force very fresh data (< 1 day old)
get_dataset("bcb_series", max_age = 1)
# Advanced: Only use cache if less than 3 days old
get_dataset("rppi", table = "sale", max_age = 3)

Version 0.6.0 introduces 7 generic helper functions that consolidate 890 lines of repetitive code patterns across dataset functions.
| File | Before | After | Lines Saved | % Reduction |
|---|---|---|---|---|
| get_abecip_indicators.R | 551 | 431 | 120 | 21.8% |
| get_abrainc_indicators.R | 544 | 445 | 99 | 18.2% |
| get_secovi.R | 438 | 356 | 82 | 18.7% |
| get_bcb_series.R | 334 | 278 | 56 | 16.8% |
| get-dataset.R | 833 | 773 | 60 | 7.2% |
| TOTAL | 2,700 | 2,283 | 417 | 15.4% |
See .claude/phase3_completion_summary.md for complete
details.
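The per-file savings and percentages in the table above can be re-derived from the Before and After columns. The snippet below is a simple arithmetic check in base R; the vector names are chosen here for illustration.

```r
# Line counts copied from the Before and After columns of the table.
before <- c(get_abecip_indicators = 551, get_abrainc_indicators = 544,
            get_secovi = 438, get_bcb_series = 334, get_dataset = 833)
after  <- c(get_abecip_indicators = 431, get_abrainc_indicators = 445,
            get_secovi = 356, get_bcb_series = 278, get_dataset = 773)

saved <- before - after
pct   <- round(100 * saved / before, 1)

saved                                     # 120 99 82 56 60
pct                                       # 21.8 18.2 18.7 16.8 7.2
sum(saved)                                # 417
round(100 * sum(saved) / sum(before), 1)  # 15.4
```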
Version 0.6.0 removes 8 deprecated functions from the public API. These functions are now internal-only.
Removed from NAMESPACE: 8 deprecated functions no longer exported:
- get_abecip_indicators()
- get_abrainc_indicators()
- get_bcb_realestate()
- get_bcb_series()
- get_fgv_ibre()
- get_nre_ire()
- get_rppi_bis()
- get_secovi()

get_dataset()

These functions were deprecated in v0.4.0 (18+ months ago). Users must now use get_dataset():
# Old way (NO LONGER WORKS):
data <- get_secovi()
data <- get_bcb_series(table = "price")
data <- get_abecip_indicators(table = "sbpe")
# New way (REQUIRED):
data <- get_dataset("secovi")
data <- get_dataset("bcb_series", "price")
data <- get_dataset("abecip", "sbpe")

- get_dataset()) instead of 15+
- get_from_legacy_function() → get_from_internal_function()

Files changed: R/get-dataset.R

- suppress_external_warnings() - Never called
- explore_cbic_structure() - Only in examples
- get_cbic_files() - Only in examples
- get_cbic_materials() - Only in examples
- get_cbic_steel() and get_cbic_pim():
  - attr(result, "source")
  - attr(result, "download_time")
  - attr(result, "download_info")
- steel_prices and pim tables now accessible
- steel_production remains blocked (has data quality issues)

Files changed: R/get_cbic.R
Version 0.6.0 removes usage examples from deprecated legacy functions to simplify the codebase. Since we are pre-1.0.0, this is an acceptable breaking change.
- @examples blocks from 8 deprecated functions:
  - get_secovi()
  - get_bcb_realestate()
  - get_abrainc_indicators()
  - get_abecip_indicators()
  - get_rppi_bis()
  - get_bcb_series()
  - get_fgv_ibre()
  - get_nre_ire()
- @section blocks (Progress Reporting, Error Handling)
- @details sections to 1-3 essential lines
- @section Deprecation blocks with code migration examples
- get_dataset() instead

These functions were deprecated in v0.4.0. Users should migrate to the modern API:
# Old way (still works, but no longer documented with examples):
data <- get_secovi()
# New way (recommended):
data <- get_dataset("secovi")

Full migration examples are available in each function’s @section Deprecation block.

- get_dataset() interface

Fixed SECOVI dataset to return all categories by default instead of only “condo”
Problem: get_dataset("secovi") was
only returning the “condo” category (1,939 rows) instead of all
categories (9,398 rows). This caused test failures for launch/rent/sale
tables.
Root Cause: When no table parameter was specified, the code defaulted to the first category alphabetically (“condo”), rather than fetching all categories.
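The fallback described above, and how a registry-level default can override it, might be sketched as follows. The registry entry and resolve_table_sketch() are hypothetical stand-ins written for this illustration; the four category names come from the notes themselves.

```r
# Hypothetical registry entry: table names plus an optional default_table.
secovi_entry <- list(
  tables = c("condo", "launch", "rent", "sale"),
  default_table = "all"
)

# Hypothetical resolver. Without a default_table the first table in
# alphabetical order wins, which reproduces the behaviour described above.
resolve_table_sketch <- function(entry, table = NULL) {
  if (!is.null(table)) {
    valid <- c(entry$tables, "all")
    if (!table %in% valid) {
      stop("Unknown table '", table, "'. Valid options: ",
           paste(valid, collapse = ", "))
    }
    return(table)
  }
  if (!is.null(entry$default_table)) {
    return(entry$default_table)
  }
  sort(entry$tables)[1]
}

resolve_table_sketch(secovi_entry)                        # "all"
resolve_table_sketch(secovi_entry, "rent")                # "rent"
resolve_table_sketch(list(tables = secovi_entry$tables))  # "condo" (old fallback)
```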
Solution:
- Added default_table configuration support in datasets.yaml
- Updated validate_and_resolve_table() to check for default_table setting
- Set default_table: "all" in registry

Impact:
# Now returns all categories by default
get_dataset("secovi") # → 9,398 rows, 4 categories ✅
# Specific tables still work correctly
get_dataset("secovi", "launch") # → 780 rows
get_dataset("secovi", "rent") # → 2,779 rows
get_dataset("secovi", "sale") # → 3,900 rows

- devtools::load_all() instead of library() to ensure testing of development version
- tests/comprehensive_check_v0.5.qmd)
- tests/TEST_RESULTS_SUMMARY.md, tests/QUICK_SUMMARY.md)
- _targets.R to always load development version for consistency

Version 0.5.0 introduces user-level caching, removing bundled datasets from the package to comply with CRAN’s 5MB size limit. This is a BREAKING CHANGE that affects how datasets are accessed.
- inst/cached_data/ (previously ~25MB)
- ~/.local/share/realestatebr/ (Linux/Mac) or %LOCALAPPDATA%/realestatebr/Cache/ (Windows)
- source="cache" now refers to user cache, not package cache
- source="github" now downloads from GitHub releases, not package files

# First use: downloads from GitHub releases to user cache
data <- get_dataset("abecip") # Downloads once
# Subsequent uses: loads from user cache (instant, offline)
data <- get_dataset("abecip") # Loads from ~/.local/share/realestatebr/
# Force fresh download from original source
data <- get_dataset("abecip", source = "fresh") # Downloads and caches
# Explicit source selection
data <- get_dataset("abecip", source = "cache") # User cache only
data <- get_dataset("abecip", source = "github") # GitHub releases only

- ~/.local/share/realestatebr/ (instant, offline)
- piggyback package)
- rappdirs (Imports) - Cross-platform user cache directory support
- piggyback (Suggests) - GitHub releases download support
- get_user_cache_dir(): Get path to user cache directory
- list_cached_files(): List all cached datasets
- clear_user_cache(): Remove cached datasets
- is_cached(): Check if dataset is in cache
- list_github_assets(): List available datasets on GitHub releases
- download_from_github_release(): Download specific dataset from releases
- update_cache_from_github(): Update cached datasets from GitHub
- is_cache_up_to_date(): Compare local vs GitHub cache timestamps

# Install updated package
install.packages("realestatebr") # or devtools::install_github()
# Install piggyback for GitHub downloads (recommended)
install.packages("piggyback")
# First use after update: will download datasets to user cache
data <- get_dataset("abecip")
# Check cache location
get_user_cache_dir()
# Manage cache
list_cached_files() # See what's cached
clear_user_cache("abecip") # Clear specific dataset
clear_user_cache() # Clear all (with confirmation)

- .Rbuildignore
- inst/cached_data/ kept for development/CI but excluded from distribution
- data-raw/publish-cache.R
- get_dataset() interface unchanged
- import_cached(): Still works but now loads from user cache (previously from inst/)
- cached=TRUE parameter in legacy functions: Still supported but uses new cache
- R/cache-user.R - User cache management
- R/cache-github.R - GitHub releases integration
- data-raw/publish-cache.R - Upload cache to releases
- R/get-dataset.R - Refactored cache logic
- R/cache.R - Marked as deprecated (kept for compatibility)
- .Rbuildignore - Exclude inst/cached_data/ files
- DESCRIPTION - Added rappdirs and piggyback dependencies
- source="fresh" to
source="github" for manually-updated datasets
- get_fgv_ibre() and get_nre_ire()
- fgv_data and ire objects from R/sysdata.rda
- cached=FALSE
- manual_update flag to datasets.yaml for FGV IBRE and NRE-IRE
- update_notes field documenting why fresh downloads aren’t available
- _targets.R explaining data source choices
- _targets.R: Updated fetch_dataset() to support source parameter; FGV and NRE-IRE now use source="github"
- R/get_fgv_ibre.R: Removed broken internal data fallback; added clear error for fresh downloads
- R/get_nre_ire.R: Removed broken internal data fallback; added clear error for fresh downloads
- inst/extdata/datasets.yaml: Added manual update flags and notes
- get_property_records.R (14% code reduction: 780→673 lines)
- get_ri_capitals() and get_ri_aggregates() with warning messages
- source, download_time, download_info) that were never used
- scrape_registro_imoveis_links() with better connection cleanup and reduced complexity
- nrow() before CLI
interpolation to avoid closure issues
- purrr::possibly() pattern
- bcb_category when table specified
- bcb_metadata dynamically (now downloads all 140 series, not just 15)
- get-dataset.R
- bcb_category
- table="all" in validate_and_resolve_table() function
- bcb_series categories in datasets.yaml to match metadata
- .envir = parent.frame() to cli::cli_inform() calls in cli_user() and cli_debug()
- standardize_city_names() call after binding FipeZap data
- property_records structure in get-dataset.R
- get_dataset() functionality
- source="fresh" to catch real-world failures before production
- tests/basic_checks.R for development
- eval=FALSE for faster development
- get_dataset("rppi", "ivgr") and other individual RPPI tables now work correctly
- get_rppi() function now supports all individual RPPI tables (fipezap, ivgr, igmi, iqa, iqaiw, ivar, secovi_sp) in addition to stacked tables (sale, rent, all)
- get_bcb_realestate.R, get_cbic.R, get_fgv_ibre.R, get_property_records.R, get_rppi.R, get_rppi_bis.R, get_secovi.R
- category= parameter to table= in tests/sanity_check.R

This release implements a major breaking change that
consolidates 15+ individual get_*() functions into a
single, unified get_dataset() interface. This dramatically
simplifies the package API while maintaining full functionality.
BREAKING CHANGE: All individual get_*() functions have been removed:
- get_abecip_indicators(), get_abrainc_indicators(), get_rppi(), get_bcb_realestate(), etc.
- Migration: Use get_dataset("dataset_name") instead
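The idea of one entry point dispatching to internal fetchers can be sketched with a small registry. Everything below is a hypothetical stand-in (the fetcher functions, the registry contents, the default tables, and get_dataset_sketch() itself) and is meant only to show the dispatch pattern, not the package's internals.

```r
# Hypothetical internal fetchers standing in for the former get_*() functions.
fetch_abecip_sketch <- function(table) {
  data.frame(dataset = "abecip", table = table, stringsAsFactors = FALSE)
}
fetch_rppi_sketch <- function(table) {
  data.frame(dataset = "rppi", table = table, stringsAsFactors = FALSE)
}

# Hypothetical registry: dataset name -> fetcher and (assumed) default table.
dataset_registry <- list(
  abecip = list(fetch = fetch_abecip_sketch, default_table = "sbpe"),
  rppi   = list(fetch = fetch_rppi_sketch, default_table = "fipezap")
)

# Single entry point that validates the name and dispatches to the fetcher.
get_dataset_sketch <- function(name, table = NULL) {
  entry <- dataset_registry[[name]]
  if (is.null(entry)) {
    stop("Unknown dataset '", name, "'. Available: ",
         paste(names(dataset_registry), collapse = ", "))
  }
  if (is.null(table)) {
    table <- entry$default_table
  }
  entry$fetch(table)
}

get_dataset_sketch("abecip")
get_dataset_sketch("rppi", table = "fipezap")
```

With this shape, adding a dataset means adding one registry entry rather than exporting another function.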
Major refactoring of RPPI functions for better maintainability:
- 67% code reduction: 1579 lines → 519 lines (1060 lines removed)
- Bug fix: FipeZap national index now correctly standardized to name_muni == "Brazil"
- Shared helpers: Created rppi-helpers.R with common functions to eliminate duplication
- Removed overhead: Eliminated unused stack parameter, cli_debug calls, and metadata attributes
- Simplified documentation: Removed verbose sections (Progress Reporting, Error Handling, Examples) from internal functions
- All functions now @keywords internal: Only get_dataset() is user-facing

Benefits:
- Easier to maintain and debug
- Faster execution (less overhead)
- Consistent error handling across all indices
- Bug fixes apply to all functions automatically
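A small sketch of why a shared helper makes a fix reach every index: one standardization function is applied to each index table. The helper name, the raw national labels, and the sample rows are all assumptions made for illustration; only the target value name_muni == "Brazil" comes from the notes.

```r
# Hypothetical shared helper: recode any national-level label to "Brazil".
# The raw labels used below are assumptions for illustration only.
standardize_national_sketch <- function(df, national_labels) {
  df$name_muni[df$name_muni %in% national_labels] <- "Brazil"
  df
}

fipezap_raw <- data.frame(
  name_muni = c("National Index", "Curitiba"),
  index = c(100.0, 101.5),
  stringsAsFactors = FALSE
)
igmi_raw <- data.frame(
  name_muni = c("Brasil", "Recife"),
  index = c(100.0, 99.2),
  stringsAsFactors = FALSE
)

# One helper serves every index, so a fix made here applies to all of them.
labels <- c("National Index", "Brasil")
cleaned <- lapply(
  list(fipezap = fipezap_raw, igmi = igmi_raw),
  standardize_national_sketch,
  national_labels = labels
)

cleaned$fipezap$name_muni  # "Brazil" "Curitiba"
cleaned$igmi$name_muni     # "Brazil" "Recife"
```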
Note: In v0.4.0, the CBIC dataset is limited to cement tables only (validated data). Steel and PIM tables will be added in v0.4.1.
Available in v0.4.0:
- ✅ cement_monthly_consumption - Monthly cement consumption by state
- ✅ cement_annual_consumption - Annual cement consumption by region
- ✅ cement_production_exports - Production, consumption, and export data
- ✅ cement_monthly_production - Monthly cement production by state
- ✅ cement_cub_prices - CUB cement prices by state

Coming in v0.4.1:
- ⏳ Steel prices and production data
- ⏳ PIM industrial production indices
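One way to turn the available/deferred split into an informative error is a small validation step before any download. The function below is a hypothetical sketch: the cement table names are copied from the list above, while "steel_prices" as a deferred table name, the choice of the first cement table as default, and the error wording are assumptions.

```r
# Table names copied from the list above.
cbic_available <- c(
  "cement_monthly_consumption", "cement_annual_consumption",
  "cement_production_exports", "cement_monthly_production",
  "cement_cub_prices"
)
# Assumed name for a deferred table; the real identifiers may differ.
cbic_deferred <- c("steel_prices")

# Hypothetical validator: returns the resolved table name or stops with a
# message that tells the user what is available instead.
validate_cbic_table_sketch <- function(table = NULL) {
  if (is.null(table)) {
    return(cbic_available[1])  # assumed default: first cement table
  }
  if (table %in% cbic_deferred) {
    stop("Table '", table, "' is deferred to v0.4.1. Available tables: ",
         paste(cbic_available, collapse = ", "), call. = FALSE)
  }
  if (!table %in% cbic_available) {
    stop("Unknown CBIC table '", table, "'.", call. = FALSE)
  }
  table
}

validate_cbic_table_sketch()                     # "cement_monthly_consumption"
validate_cbic_table_sketch("cement_cub_prices")  # "cement_cub_prices"
msg <- tryCatch(validate_cbic_table_sketch("steel_prices"),
                error = function(e) conditionMessage(e))
msg  # the deferral message, including the list of available tables
```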
# Works in v0.4.0
get_dataset("cbic") # Default: cement_monthly_consumption
get_dataset("cbic", "cement_cub_prices")
# Will error with informative message
get_dataset("cbic", "steel_prices") # Deferred to v0.4.1

- fetch_*() functions with @keywords internal
- inst/extdata/datasets.yaml
- rppi and rppi_indices into single hierarchical structure
- table, cached, quiet, max_retries

New unified interface:
# Get data from any dataset
data <- get_dataset("abecip") # Default table
data <- get_dataset("abecip", table = "sbpe") # Specific table
data <- get_dataset("rppi", table = "fipezap") # Hierarchical access
# Discover datasets
datasets <- list_datasets()
info <- get_dataset_info("rppi")

Removed functions (now internal):
- get_abecip_indicators() → get_dataset("abecip")
- get_abrainc_indicators() → get_dataset("abrainc")
- get_rppi() → get_dataset("rppi")
- get_bcb_realestate() → get_dataset("bcb_realestate")
- get_bcb_series() → get_dataset("bcb_series")
- Plus 10 more functions

- source = "cache"/"github"/"fresh" options
- test-internal-functions-0.4.0.R with 100 tests

# OLD (0.3.x) - Will no longer work
data <- get_abecip_indicators(table = "sbpe")
data <- get_rppi(table = "fipezap")
data <- get_bcb_realestate(table = "all")
# NEW (0.4.0) - Required migration
data <- get_dataset("abecip", table = "sbpe")
data <- get_dataset("rppi", table = "fipezap")
data <- get_dataset("bcb_realestate", table = "all")

| Old Function | New get_dataset() Name |
|---|---|
| get_abecip_indicators() | "abecip" |
| get_abrainc_indicators() | "abrainc" |
| get_rppi() | "rppi" |
| get_bcb_realestate() | "bcb_realestate" |
| get_bcb_series() | "bcb_series" |
| get_rppi_bis() | "rppi_bis" |
| get_secovi() | "secovi" |
| get_fgv_indicators() | "fgv_indicators" |
| get_b3_stocks() | "b3_stocks" |
| get_nre_ire() | "nre_ire" |
| get_cbic_*() | "cbic" |
| get_itbi() | "itbi" |
| get_property_records() | "registro" |
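For scripts that still reference the old names, the mapping above can be kept as a lookup vector. The helper below is a hypothetical convenience written for this illustration, not a package function; the wildcard get_cbic_*() row is left out because it does not name a single function.

```r
# Old function name -> dataset name, copied from the table above.
migration_map <- c(
  get_abecip_indicators  = "abecip",
  get_abrainc_indicators = "abrainc",
  get_rppi               = "rppi",
  get_bcb_realestate     = "bcb_realestate",
  get_bcb_series         = "bcb_series",
  get_rppi_bis           = "rppi_bis",
  get_secovi             = "secovi",
  get_fgv_indicators     = "fgv_indicators",
  get_b3_stocks          = "b3_stocks",
  get_nre_ire            = "nre_ire",
  get_itbi               = "itbi",
  get_property_records   = "registro"
)

# Hypothetical helper: build the replacement call for an old function name.
suggest_migration <- function(old_function) {
  key <- sub("\\(\\)$", "", old_function)  # accept "get_secovi" or "get_secovi()"
  if (!key %in% names(migration_map)) {
    stop("No migration entry for '", old_function, "'.", call. = FALSE)
  }
  sprintf('get_dataset("%s")', migration_map[[key]])
}

cat(suggest_migration("get_secovi()"), "\n")          # get_dataset("secovi")
cat(suggest_migration("get_property_records"), "\n")  # get_dataset("registro")
```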
# OLD - Multiple functions
fipezap <- get_rppi_fipezap()
igmi <- get_rppi_igmi()
bis <- get_rppi_bis()
# NEW - Unified hierarchical access
fipezap <- get_dataset("rppi", table = "fipezap")
igmi <- get_dataset("rppi", table = "igmi")
bis <- get_dataset("rppi", table = "bis")

- fetch_rppi(), fetch_abecip(), etc.
- datasets.yaml
- get_from_internal_function() → get_from_legacy_function()
- get_dataset(), list_datasets(), utilities

This release represents a major architectural shift toward a unified, maintainable API. While it introduces breaking changes, the new interface is significantly simpler and more powerful.
Full Changelog: https://github.com/viniciusoike/realestatebr/compare/v0.3.0…v0.4.0
- _targets.R workflow with automated dependency management and parallel processing
- cache.R with better fallback mechanisms
- get_abrainc_indicators() (category → table)
- get_nre_ire() to use internal package data directly
- sysdata.rda with latest processed datasets
- targets and tarchetypes to package dependencies

This release establishes the foundation for automated data processing and validation, setting the stage for Phase 3 implementation with large dataset support.
Full Changelog: https://github.com/viniciusoike/realestatebr/compare/v0.2.0…v0.3.0
- get_* functions with consistent APIs, CLI-based error handling, and progress reporting
- table, cached, quiet, and max_retries parameters
- list_datasets() - Discover available datasets with filtering capabilities
- get_dataset() - Unified data access function with intelligent fallback
- inst/extdata/datasets.yaml for centralized dataset management
- table parameter replacing category across all functions
- category parameter
- table = "all"
- get_cbic_cement() - Cement consumption, production, and CUB prices
- get_cbic_steel() - Steel prices and production data
- get_cbic_pim() - Industrial production indices
- cli package integration for long-running operations
- category parameter deprecated across all functions in favor of table
- category = "value" with table = "value"
- cached_data/ to
inst/cached_data/ for package compliance
- get_abecip_indicators() - ABECIP real estate financing data
- get_abrainc_indicators() - ABRAINC launches and sales data
- get_b3_stocks() - B3 stock market data with improved column naming
- get_bcb_realestate() - Central Bank real estate credit data
- get_bcb_series() - BCB macroeconomic time series
- get_rppi_bis() - Bank for International Settlements RPPI data
- get_cbic_cement() - CBIC cement industry data (NEW)
- get_cbic_steel() - CBIC steel industry data (NEW)
- get_cbic_pim() - CBIC industrial production data (NEW)
- get_fgv_indicators() - FGV construction confidence indicators
- get_nre_ire() - Real Estate Index from NRE-Poli USP
- get_property_records() - Property registration data with robust Excel processing
- get_rppi() - Comprehensive RPPI coordinator with all sources
- get_secovi() - SECOVI-SP real estate data with parallel processing
- get_rppi_bis() - Main function with modernized backend and single tibble returns
- get_itbi() and get_itbi_bhe() - Planned for Phase 3 (DuckDB integration)
- devtools integration

# Old (deprecated but still works)
data <- get_abecip_indicators(category = "all")
# New (recommended)
data <- get_abecip_indicators(table = "all")

# Discover available datasets
datasets <- list_datasets()
# Get data with unified interface
data <- get_dataset("abecip_indicators")
# Use modernized functions with progress
data <- get_abecip_indicators(table = "indicators", quiet = FALSE)

- cli for modern error handling and progress reporting
- dplyr, readr, httr, and rvest

This release represents the completion of Phase 1 modernization, establishing a solid foundation for Phase 2 (data pipeline automation) and Phase 3 (large dataset support with DuckDB).
Full Changelog: https://github.com/viniciusoike/realestatebr/compare/v0.1.5…v0.2.0