Bug #2459
Archive Historical MongoDB Collection Data to S3 (CSV + Compressed Format)
Description
We need to implement a process to archive historical data from MongoDB collections. The archival process should extract old data, convert it into CSV format, compress it, and store it securely in an S3 bucket for long-term storage and cost optimization.
Scope of Work:
Data Identification
Identify historical data based on configurable criteria (e.g., records older than X months).
Ensure active/critical data is excluded.
Data Extraction
Fetch data from MongoDB collections in batches to avoid performance impact.
Support multi-tenant collections (if applicable).
Data Transformation
Convert extracted data into CSV format.
Ensure proper schema mapping and column headers.
Compression
Compress CSV files using one of the following formats:
.zip (preferred)
.gz (gzip, optional)
Ensure optimal file size for storage and transfer.
Storage (AWS S3)
Upload compressed files to designated S3 bucket.
Define folder structure:
s3://<bucket-name>/<tenant>/<collection>/<year>/<month>/
Enable versioning and lifecycle policies (if required).
Post-Archival Handling
Option to delete or mark archived records in MongoDB.
Maintain audit logs for archived data.
Automation
Schedule via cron/job scheduler (e.g., daily/weekly).
Retry mechanism for failed uploads.
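The scope above (cutoff, CSV conversion, compression, S3 folder structure) can be sketched roughly as follows. This is a minimal Python illustration only, not the project's implementation: the function names, the sample column names, and the zero-padded month in the key are assumptions, and the MongoDB query and the actual S3 upload call are deliberately left out.

```python
import csv
import io
import zipfile
from datetime import datetime, timezone


def cutoff_date(now: datetime, months: int) -> datetime:
    """Return the first instant of the month that lies `months` months before `now`.

    Records older than this would be archive candidates, e.g. via a filter
    such as {"createdAt": {"$lt": cutoff}} (the field name is hypothetical).
    """
    total = now.year * 12 + (now.month - 1) - months
    return datetime(total // 12, total % 12 + 1, 1, tzinfo=now.tzinfo)


def rows_to_csv(rows: list, columns: list) -> bytes:
    """Serialize records to CSV with a fixed header row.

    Fields not listed in `columns` are dropped; missing fields become empty.
    """
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=columns, extrasaction="ignore", restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")


def zip_bytes(member_name: str, payload: bytes) -> bytes:
    """Compress one file into an in-memory ZIP archive using DEFLATE."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(member_name, payload)
    return buf.getvalue()


def s3_key(tenant: str, collection: str, year: int, month: int, filename: str) -> str:
    """Build the object key for <tenant>/<collection>/<year>/<month>/<filename>."""
    return f"{tenant}/{collection}/{year:04d}/{month:02d}/{filename}"
```

For example, s3_key("acme", "stations", 2024, 3, "stations_2024_03.zip") gives "acme/stations/2024/03/stations_2024_03.zip"; the tenant and file names here are made up.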
Acceptance Criteria:
Historical data is correctly identified and extracted.
Data is converted into valid CSV format.
Files are compressed and uploaded to S3 successfully.
No performance impact on live MongoDB operations.
Logs are maintained for audit and traceability.
Process is configurable (date range, collection name, tenant, etc.).
Technical Considerations:
Use streaming/batch processing to handle large datasets.
Ensure secure S3 upload (IAM roles / access keys).
Handle failures and partial uploads gracefully.
Validate data integrity after archival.
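A rough sketch of the considerations above, assuming a Python-style helper layer: batch iteration so only one batch is in memory at a time, a SHA-256 checksum that can be recorded before upload and compared afterwards, and a retry wrapper with exponential backoff. All names and defaults (3 attempts, 1 second base delay) are illustrative assumptions, not values from the ticket.

```python
import hashlib
import time
from itertools import islice
from typing import Callable, Iterable, Iterator


def batched(items: Iterable, size: int) -> Iterator[list]:
    """Yield lists of at most `size` items, consuming the source lazily."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def sha256_hex(payload: bytes) -> str:
    """Checksum to store in the audit log and re-check after archival."""
    return hashlib.sha256(payload).hexdigest()


def with_retries(action: Callable, attempts: int = 3, base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep):
    """Run `action`, retrying on any exception with exponential backoff.

    The delay doubles after each failure; the last failure is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

A caller would wrap the upload step, for example with_retries(lambda: upload(archive)), where upload is whatever function performs the S3 transfer (hypothetical here).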
Updated by Shreya Agarwal about 2 months ago
Today, I continued working on the Archive Process for station setup steps in the EVCSMS project. The work is currently in progress. I have analyzed the table structure and started implementing the archival flow, which involves identifying old records, converting them into CSV format, compressing them into ZIP files, and preparing for upload to AWS S3.
Updated by Shreya Agarwal about 2 months ago
Completed the development and implementation of the Station Archive functionality. The archive process successfully handles data extraction based on the selected month and year, converts the data into CSV format, compresses it into a ZIP file, and uploads it to AWS S3 using tenant-specific configuration.
In addition to this, I have started working on the Common Archive Service. The initial design and structure have been defined to make the archive process reusable across multiple modules. Development of this common service is in progress.
Updated by Shreya Agarwal about 2 months ago
Completed the Archive Station module with proper API flow, including input validation, tenant-based data fetching, and secure S3 upload integration. I also implemented consistent API response handling to ensure correct success and failure messages based on data availability. Additionally, I developed and completed the Common Archive Service to support reusable archive functionality across multiple modules. Both Archive Station and Common Archive Service are now implemented and working as expected with proper logging and error handling.
Updated by Shreya Agarwal about 1 month ago
Today I worked on minor updates and fixes in the Archive module, focusing mainly on improving its stability and response handling.
Along with development tasks, I also focused on learning and understanding some .NET concepts to strengthen my development skills. I tried to relate the concepts I learned with my current project to improve implementation quality and practical understanding.
Updated by Shreya Agarwal about 1 month ago
Focused on understanding the Station Setup APIs by thoroughly reviewing the API documentation to gain clarity on the functionality, request/response structure, and overall workflow. Executed initial test cases for key APIs and validated the responses against expected outcomes. Identified and documented a few observations, and clarified queries with the team. Additionally, performed basic validation checks, including request payload verification and response handling.
Updated by Shreya Agarwal about 1 month ago
Analyzed the Calculate Charger Health Scores API to understand, step by step, how charger health is calculated. I looked at how tenant-wise data is handled and how the API key and JWT authentication flows work, and reviewed how the service and repository layers fetch data. The final score is calculated using a simple weighted formula over four components: Success Rate, Uptime, Fault Score, and Power Performance (with weights like 0.30, 0.35, 0.25, 0.10). Based on this, I identified areas for improvement and started planning changes for better configuration and understanding.
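As an illustration of the weighted formula, here is a minimal Python sketch. It assumes the formula is a plain weighted sum, that the weights pair with the components in the order listed, and that each component is already normalized to a 0-100 scale; none of these details are confirmed by the API itself, and the key names are made up.

```python
# Example weights from the note above; the pairing with components is assumed.
WEIGHTS = {
    "success_rate": 0.30,
    "uptime": 0.35,
    "fault_score": 0.25,
    "power_performance": 0.10,
}


def health_score(components: dict, weights: dict = WEIGHTS) -> float:
    """Weighted sum of component scores (each assumed to be on a 0-100 scale)."""
    return sum(weight * components[name] for name, weight in weights.items())


example = {
    "success_rate": 90.0,
    "uptime": 80.0,
    "fault_score": 100.0,
    "power_performance": 50.0,
}
score = health_score(example)  # 27 + 28 + 25 + 5, i.e. about 85
```

Because the example weights add up to 1, the result stays on the same scale as the inputs.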
Updated by Shreya Agarwal about 1 month ago
Worked on implementing and refining the Charger Health configuration-based scoring system. First, I moved all scoring parameters (weights, threshold, and fault penalty) to app settings and set up proper configuration binding using Charger Health Options with IOptions, replacing the earlier hardcoded values. After that, I verified the configuration mapping in the repository and ensured that the values are correctly injected into the scoring logic at runtime. Then I integrated and tested the updated configuration by changing values in app settings for multiple scenarios. Finally, I confirmed that the scoring system is working correctly.
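The idea of binding scoring parameters from settings instead of hardcoding them can be sketched in Python as below. This is only an analogue of the .NET IOptions approach described above, not the project's code: the section name "ChargerHealth", the key names, the sample values for threshold and fault penalty, and the rule that weights must sum to 1 are all assumptions.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ChargerHealthSettings:
    """Scoring parameters read from configuration (hypothetical shape)."""
    weights: dict
    threshold: float
    fault_penalty: float

    def __post_init__(self):
        # Assumed sanity check: reject weight sets that do not sum to 1.
        total = sum(self.weights.values())
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"weights must sum to 1.0, got {total}")


def load_settings(raw_json: str, section: str = "ChargerHealth") -> ChargerHealthSettings:
    """Bind one section of a JSON settings document to a typed settings object."""
    data = json.loads(raw_json)[section]
    return ChargerHealthSettings(
        weights={name: float(value) for name, value in data["Weights"].items()},
        threshold=float(data["Threshold"]),
        fault_penalty=float(data["FaultPenalty"]),
    )


SAMPLE = """
{
  "ChargerHealth": {
    "Weights": {"SuccessRate": 0.30, "Uptime": 0.35, "FaultScore": 0.25, "PowerPerformance": 0.10},
    "Threshold": 70,
    "FaultPenalty": 5
  }
}
"""
settings = load_settings(SAMPLE)
```

Validating at load time means a bad edit to the settings file fails at startup rather than silently skewing scores.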
Updated by Shreya Agarwal about 1 month ago
Today I worked on the Charger Health module and continued studying its API flow and logic. I also looked into tenant provisioning and rollback handling. Along with this, I started reviewing the company-related module; some parts are still pending and I will continue with them. Overall, I focused on improving my understanding of the codebase and identifying remaining changes and edge cases.
Updated by Shreya Agarwal about 1 month ago
Worked on the Charger Health module and reviewed the remaining logic and edge cases. I also continued exploring the company module and tenant provisioning flow; some parts are now understood, but others are still pending. Overall, I focused on improving my understanding of the system and identifying areas that require changes.
Updated by Shreya Agarwal about 1 month ago
Today, I worked on the Customer Support module and focused on understanding the existing implementation, including categories and related APIs.
I developed two APIs:
Status Count API – to fetch support ticket counts grouped by status (Open, Assigned, Closed, etc.).
Support Manager Users API – to fetch the list of users having the role “support_manager” using proper table joins.
Updated by Shreya Agarwal about 1 month ago
Today, I worked on the Customer Support module and shifted the API implementation to the Support Manager Controller as per requirement. I implemented two APIs: Support Status Count and Support Manager Users. I analyzed the users, roles, and user_roles tables and verified the queries in pgAdmin. I identified an issue where the API was returning empty data despite correct DB results and fixed it by correcting tenant/schema handling. Finally, I tested both APIs to ensure proper functionality.
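The two queries behind these APIs can be sketched as follows. This is a self-contained Python illustration that swaps in an in-memory SQLite database for the project's PostgreSQL tenant schema; the column names, the support_tickets table layout, and the sample rows are assumptions, and only the table names users, roles, and user_roles come from the notes above.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical minimal schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE roles (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE user_roles (user_id INTEGER, role_id INTEGER);
        CREATE TABLE support_tickets (id INTEGER PRIMARY KEY, status TEXT);
        INSERT INTO users VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
        INSERT INTO roles VALUES (1, 'support_manager'), (2, 'operator');
        INSERT INTO user_roles VALUES (1, 1), (2, 2), (3, 1);
        INSERT INTO support_tickets (status) VALUES
            ('Open'), ('Open'), ('Assigned'), ('Closed');
    """)
    return conn


def status_counts(conn: sqlite3.Connection) -> dict:
    """Ticket counts grouped by status."""
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM support_tickets GROUP BY status"
    ).fetchall()
    return dict(rows)


def support_managers(conn: sqlite3.Connection) -> list:
    """Users holding the support_manager role, via users -> user_roles -> roles."""
    return conn.execute(
        """
        SELECT u.id, u.name
        FROM users u
        JOIN user_roles ur ON ur.user_id = u.id
        JOIN roles r ON r.id = ur.role_id
        WHERE r.name = ?
        ORDER BY u.id
        """,
        ("support_manager",),
    ).fetchall()
```

In a multi-tenant PostgreSQL setup the same queries would have to run against the correct tenant schema, which is the kind of mismatch that can produce empty results even when the SQL itself is right.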
Updated by Shreya Agarwal about 1 month ago
Continued working on the Support Manager module. I also explored the project to trace the data source for the system_issues table, but I do not yet have complete clarity on where all the data comes from and will continue working on it.
Additionally, I worked on the assign API, but due to insufficient permissions I was not able to run and fully test it, so the output could not be verified.
Updated by Shreya Agarwal about 1 month ago
Worked on integrating system-generated issues with the support module and added logic in the wallet top-up transaction service to auto-create support tickets for payment failures. Verified the basic flow in Swagger and confirmed that data is being saved in the support tickets table. I also worked on the Support Ticket Assign API changes; the basic implementation is complete on the code side. Full testing and validation of all scenarios is still pending for both flows, and Assign API testing could not be completed today due to insufficient permissions for the required scenarios. Remaining testing and validation will continue next. (branch 2426)
Updated by Shreya Agarwal about 1 month ago
Started working on the heartbeat flow to create support tickets for machine offline/online scenarios. The code changes have been completed and integrated into the service. While testing the heartbeat API, I found an issue related to API key validation (invalid API key). To proceed, I temporarily commented out the API key validation and tested the API, which returned a success response. However, database verification is still pending, as proper validation will require a correct API key configuration or simulator-based testing.
Once the correct API key is available, I will proceed with end-to-end testing to verify ticket creation in the support tickets table.
(branch 2426)
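One possible shape for the heartbeat-to-ticket logic, sketched in Python purely for illustration. It assumes that a ticket is raised only when a charger's state changes between online and offline, and that the first heartbeat seen for a charger only establishes a baseline; the class, field, and issue names are hypothetical and do not reflect the code on the branch.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatTicketTracker:
    """Tracks the last known state per charger and records tickets on transitions."""
    last_online: dict = field(default_factory=dict)
    tickets: list = field(default_factory=list)

    def record(self, charger_id: str, online: bool) -> bool:
        """Store the new state; return True if a ticket was created."""
        previous = self.last_online.get(charger_id)
        self.last_online[charger_id] = online
        if previous is None or previous == online:
            # First heartbeat or unchanged state: nothing to report.
            return False
        issue = "machine_online" if online else "machine_offline"
        self.tickets.append({"charger_id": charger_id, "issue": issue})
        return True
```

Creating tickets only on transitions avoids a new ticket for every heartbeat while a machine stays offline.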