# fortran-s3-accessor

> Fortran library providing S3-compatible object storage access with familiar I/O interfaces

**fortran-s3-accessor** provides a simple, direct interface for accessing S3-compatible object storage from Fortran programs. The library is designed for scientific computing workflows that need to read and write data to cloud object storage with minimal friction.

## Features

- Streaming via POSIX `popen()` eliminates disk I/O overhead
- `s3_http` module for direct S3 operations
- `s3_io` module providing familiar Fortran I/O patterns (open/read/write/close)
- `s3://bucket/key` URIs for seamless cross-bucket operations

## Architecture

The library follows a layered architecture with three core modules:
### curl_stream

Provides the C interoperability layer for high-performance streaming:

- `stream_command_output(command, output, exit_status)`: Stream command output directly to memory via `popen()`
- `is_streaming_available()`: Platform detection for streaming capability
- Direct C bindings (`popen`, `fread`, `pclose`) for zero-copy streaming
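Most applications will use the higher-level modules below, but `curl_stream` can also be called directly. A minimal sketch, assuming `stream_command_output` is a subroutine and `is_streaming_available` returns a `logical` (the command shown is purely illustrative):

```fortran
program stream_demo
    use curl_stream
    implicit none
    character(len=:), allocatable :: output
    integer :: exit_status

    if (is_streaming_available()) then
        ! Stream the command's stdout straight into memory (no temp file)
        call stream_command_output('curl -s https://example.com/', output, exit_status)
        if (exit_status == 0) print *, 'Received ', len(output), ' bytes'
    else
        print *, 'Streaming not available on this platform'
    end if
end program stream_demo
```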
### s3_http

Provides direct curl-based HTTP access to S3 operations:
- `s3_config`: Configuration type containing bucket name, region, endpoint, credentials, and protocol settings
- `s3_init(config)`: Initialize the library with an S3 configuration
- `s3_get_object(key, content)`: Download object content from S3
- `s3_put_object(key, content)`: Upload object content to S3 (requires authentication)
- `s3_object_exists(key)`: Check if an object exists using an HTTP HEAD request
- `s3_delete_object(key)`: Delete an object from S3 (requires authentication)

For convenience, the library provides URI-aware versions of all operations:
- `s3_get_uri(uri, content)`: Get object using `s3://bucket/key` format
- `s3_put_uri(uri, content)`: Put object using URI
- `s3_exists_uri(uri)`: Check existence using URI
- `s3_delete_uri(uri)`: Delete object using URI

These functions automatically parse the bucket name from the URI and temporarily switch contexts when accessing different buckets.
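The delete operations have no dedicated example in the Examples section below, so here is a minimal sketch. It assumes the delete functions return a `logical` success flag like the get/put operations shown later; the bucket names and key are hypothetical, and the credentials are AWS's standard documentation examples:

```fortran
program delete_demo
    use s3_http
    implicit none
    type(s3_config) :: config
    logical :: success

    ! Deletion requires authentication
    config%bucket = 'my-bucket'
    config%access_key = 'AKIAIOSFODNN7EXAMPLE'
    config%secret_key = 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'
    config%use_https = .true.
    call s3_init(config)

    ! Key-based form (uses the configured bucket)
    success = s3_delete_object('results/old-output.txt')

    ! URI-based form (parses the bucket from the URI)
    success = s3_delete_uri('s3://other-bucket/results/old-output.txt')
    if (.not. success) print *, 'Delete failed'
end program delete_demo
```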
### s3_io

Provides a Fortran-like I/O interface built on top of `s3_http`:

- `s3_open(unit, key, mode, iostat)`: Open an S3 object for reading or writing
- `s3_close(unit, iostat)`: Close the S3 object (uploads in write mode)
- `s3_read_line(unit, line, iostat)`: Read a line from the object
- `s3_write_line(unit, line, iostat)`: Write a line to the object buffer
- `s3_rewind(unit, iostat)`: Rewind the read position to the beginning

This module manages up to 100 concurrent file handles with internal buffering.
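The read workflow is shown in the Examples section; for the write direction, here is a minimal sketch. It assumes the mode string `'write'` mirrors the `'read'` mode used later, and that credentials are configured as for `s3_put_object` (the bucket and key are hypothetical):

```fortran
program write_lines
    use s3_http
    use s3_io
    implicit none
    type(s3_config) :: config
    integer :: unit, iostat

    config%bucket = 'my-output-bucket'
    call s3_init(config)

    ! Lines are collected in the object buffer...
    call s3_open(unit, 'results/log.txt', 'write', iostat)
    if (iostat /= 0) stop 'Failed to open object for writing'
    call s3_write_line(unit, 'run started', iostat)
    call s3_write_line(unit, 'run finished', iostat)

    ! ...and the upload happens on close
    call s3_close(unit, iostat)
    if (iostat /= 0) stop 'Upload failed'
end program write_lines
```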
## Testing

The library includes comprehensive testing infrastructure using the test-drive framework:

```bash
# Run all tests with mock curl
PATH="test/scripts:$PATH" fpm test
```

Tests are defined in `test/test_s3_http.f90`, with mock responses stored in `test/data/responses/`.
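For orientation, a minimal test in the test-drive style (the test name and checked behaviour here are illustrative, not copied from the actual suite):

```fortran
module test_s3_demo
    use testdrive, only: new_unittest, unittest_type, error_type, check
    use s3_http
    implicit none
contains
    subroutine collect_tests(testsuite)
        type(unittest_type), allocatable, intent(out) :: testsuite(:)
        testsuite = [new_unittest('object_exists', test_object_exists)]
    end subroutine collect_tests

    subroutine test_object_exists(error)
        type(error_type), allocatable, intent(out) :: error
        type(s3_config) :: config

        config%bucket = 'test-bucket'
        call s3_init(config)
        ! With the mock curl on PATH, the HEAD request is answered
        ! from test/data/responses/ rather than the network
        call check(error, s3_object_exists('known-key.txt'))
    end subroutine test_object_exists
end module test_s3_demo
```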
## Use Cases

Access climate model output, observational data, and analysis results stored in S3. The library excels at reading from public S3 buckets commonly used in scientific computing.

## Performance

The library uses POSIX `popen()` for high-performance streaming on supported platforms. This eliminates the disk I/O bottlenecks present in v1.0.0, providing production-grade performance for HPC and scientific computing workflows.

## Requirements

Requires the `curl` command to be available in `PATH`.

## Examples

Download a text file from a public S3 bucket:
```fortran
program basic_download
    use s3_http
    implicit none
    type(s3_config) :: config
    character(len=:), allocatable :: content
    logical :: success

    ! Configure for public NOAA bucket
    config%bucket = 'noaa-gfs-bdp-pds'
    config%region = 'us-east-1'
    config%endpoint = 's3.amazonaws.com'
    config%use_https = .true.
    call s3_init(config)

    ! Download README
    success = s3_get_object('README.md', content)
    if (success) then
        print *, 'Downloaded ', len(content), ' bytes'
        print *, content
    else
        print *, 'Download failed'
    end if
end program basic_download
```
Verify an object exists before attempting to download:
```fortran
program check_exists
    use s3_http
    implicit none
    type(s3_config) :: config
    logical :: exists

    config%bucket = 'my-bucket'
    config%use_https = .true.
    call s3_init(config)

    exists = s3_object_exists('data/input.nc')
    if (exists) then
        print *, 'File exists, proceeding with download...'
    else
        print *, 'File not found, using default data'
    end if
end program check_exists
```
Read an S3 object line by line like a regular file:
```fortran
program read_lines
    use s3_http
    use s3_io
    implicit none
    type(s3_config) :: config
    integer :: unit, iostat, line_count
    character(len=1024) :: line

    ! Initialize
    config%bucket = 'my-data-bucket'
    call s3_init(config)

    ! Open S3 object
    call s3_open(unit, 'data/measurements.csv', 'read', iostat)
    if (iostat /= 0) stop 'Failed to open file'

    ! Read all lines
    line_count = 0
    do
        call s3_read_line(unit, line, iostat)
        if (iostat /= 0) exit
        line_count = line_count + 1
        print *, 'Line ', line_count, ': ', trim(line)
    end do

    call s3_close(unit, iostat)
    print *, 'Read ', line_count, ' lines'
end program read_lines
```
Use `s3://` URIs to seamlessly access multiple buckets:
```fortran
program uri_access
    use s3_http
    implicit none
    type(s3_config) :: config
    character(len=:), allocatable :: content
    logical :: success

    ! Initialize with default bucket
    config%bucket = 'default-bucket'
    config%use_https = .true.
    call s3_init(config)

    ! Access different bucket via URI
    success = s3_get_uri('s3://other-bucket/path/to/data.txt', content)
    if (success) then
        print *, 'Content from other-bucket: ', content
    end if

    ! Original bucket still accessible
    success = s3_get_object('local-data.txt', content)
end program uri_access
```
Download and process NetCDF files from ESGF climate data archives:
```fortran
program climate_data
    use s3_http
    use netcdf
    implicit none
    type(s3_config) :: config
    character(len=:), allocatable :: nc_data
    character(len=*), parameter :: climate_uri = &
        's3://esgf-world/CMIP6/CMIP/AWI/AWI-ESM-1-1-LR/piControl/r1i1p1f1/fx/areacella/gn/v20200212/' // &
        'areacella_fx_AWI-ESM-1-1-LR_piControl_r1i1p1f1_gn.nc'
    logical :: success

    ! Configure for public ESGF bucket
    config%use_https = .true.
    call s3_init(config)

    ! Download NetCDF file
    success = s3_get_uri(climate_uri, nc_data)
    if (success) then
        ! Write to temp file and process with NetCDF library
        open(10, file='/tmp/climate.nc', form='unformatted', access='stream')
        write(10) nc_data
        close(10)
        ! Now use NetCDF library to read the file
        print *, 'Climate data downloaded, ready for processing'
    end if
end program climate_data
```
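Continuing from the temp file written above, reading it with the standard netcdf-fortran API might look like this sketch (the variable name `areacella` is taken from the file name above; real code should inquire for variables and dimensions rather than assume them):

```fortran
program read_climate_nc
    use netcdf
    implicit none
    integer :: ncid, varid, status

    status = nf90_open('/tmp/climate.nc', NF90_NOWRITE, ncid)
    if (status /= NF90_NOERR) stop 'Failed to open NetCDF file'

    ! Look up the grid-cell area variable downloaded above
    status = nf90_inq_varid(ncid, 'areacella', varid)
    if (status == NF90_NOERR) print *, 'Found variable areacella'

    status = nf90_close(ncid)
end program read_climate_nc
```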
Upload data to S3 (requires AWS credentials):
```fortran
program write_data
    use s3_http
    implicit none
    type(s3_config) :: config
    character(len=:), allocatable :: data
    logical :: success

    ! Configure with credentials
    config%bucket = 'my-output-bucket'
    config%access_key = 'AKIAIOSFODNN7EXAMPLE'
    config%secret_key = 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'
    config%use_https = .true.
    call s3_init(config)

    ! Create data
    data = 'simulation results: temp=25.3, pressure=1013.2'

    ! Upload to S3
    success = s3_put_object('results/output.txt', data)
    if (success) then
        print *, 'Data uploaded successfully'
    else
        print *, 'Upload failed'
    end if
end program write_data
```
## Building

Clone the repository:

```bash
git clone https://github.com/pgierz/fortran-s3-accessor.git
```

Build with FPM:

```bash
fpm build
fpm test   # Run tests
```

Or build with CMake:

```bash
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j
```

Run examples:

```bash
fpm run test_simple                   # Basic operations
fpm run --example s3_netcdf_example   # NetCDF example
```
## Usage

To use in your project with FPM, add to `fpm.toml`:

```toml
[dependencies]
fortran-s3-accessor = { git = "https://github.com/pgierz/fortran-s3-accessor.git" }
```

Then in your code:

```fortran
use s3_http   ! For direct S3 operations
use s3_io     ! For Fortran I/O interface
```