Tools and Technology stack to support the applications written mainly in PHP, Golang and Node.js: GitLab, Redis and Memcached + Mcrouter, Solr, HAProxy, DynamoDB, RabbitMQ, AWS SQS, MariaDB and AWS RDS/Aurora, StatusCake and New Relic for monitoring, and Ansible, Terraform and AWS for infrastructure.
● Ensure production is available and secure and that problems are resolved as quickly as possible, reducing the impact on our customers.
● Design infrastructure architectures that are highly available, scalable, and self-healing.
● Design, manage, and maintain tools to automate operational processes.
● Define and deploy monitoring, metrics, and logging systems.
● Security engineering, server and network hardening, auditing and reporting.
Deployed several microservices and a server-side rendering service with composer for a new ad page for the OLX Europe real estate sites on Kubernetes, replacing the legacy page. Product OKR objective.
Drove the migration to a cache cluster architecture using Redis 4 with LFU eviction, replacing the old Memcached implementation that lacked auto-discovery. Pack/Squad OKR objective.
Finished the migration from MariaDB Galera clusters to AWS RDS/Aurora, using ProxySQL to warm up the Aurora nodes. Infrastructure OKR objective.
Lead DevOps Engineer in New Offers
● Responsible for all infrastructure and its automation for the microservice ecosystems behind the New Offers team's products
The project in the news: http://www.portalnovarejo.com.br/2017/06/23/fatores-estrategias-omnichannel-varejo/
Lead DevOps Engineer
Tools and Technology stack to support the company product written mainly in PHP and Python: Jenkins for CI/Pipelines, Redis for cache, MongoDB, AMQP (RabbitMQ), Elasticsearch, MySQL, Ansible and AWS for infrastructure.
Re-architected processes and the product, ran stress/load tests, and refactored the application to allow onboarding one of the biggest insurance companies in Brazil, with traffic and MAU/DAU scaling tenfold. In its initial release, the platform was used for real-time communications and employee engagement in many campaigns, such as the social network launch and the corporate holiday party.
Reduced the average number of unplanned downtime incidents per month from 8 to zero (3 months without unplanned downtime) by improving monitoring and autoscaling and by working with developers to make the application more resilient (circuit breakers/fallbacks)
Tuned architectural components, services and the Linux kernel, and worked alongside developers to scale the application tenfold (from 5k to 50k supported users), pinpointing bottlenecks during stress tests to onboard a large enterprise customer (Aug/2016 to Nov/2016)
Reduced the cost of EC2 instances by more than 70% using AWS Spot Instances, mainly after working to scale the application tenfold, keeping costs less than 50% higher while supporting 4 times the number of users
Improved security through a detailed audit and control of privileges and by removing hundreds of system accounts that had not been properly deprovisioned because of broken automation (Apr/2016)
Created many metrics and automated tests that check system thresholds and raise alarms before critical levels are reached. These resulted from investigative post-mortems and checks for monitoring blind spots, so that the team knows the system is breaking before a client calls
Improved the stability of CI/CD automation, reducing the average number of production deploy automation issues from 4 per month to 0 (3 months without issues)
Lead DevOps Engineer
Tools and Technology stack used to support the company product written mainly in Java: Atlassian Bamboo for CI, Jetty, Nginx for reverse proxy, Oracle, PostgreSQL, Hadoop/HBase/Phoenix, Logstash, AWS ELB, AWS CloudFormation, AWS CodeDeploy, GitHub, Qlik for BI.
Relied heavily on Netflix tools for a resilient, scalable Java ecosystem in the cloud, and on HashiCorp tools to support cloud management, immutable servers, baked AMIs, fast deploys and rolling updates.
Developed Python Lambda functions that log key AWS API calls and announce them through a chatbot, using CloudWatch Events (announced in Jan/2016), plus a serverless panel to show the events using Lambda/API Gateway/S3 with AngularJS (Mar 2016).
Updated support tooling to use AWS EC2 Run Command/SSM, announced for Linux in Dec 2015 (Jan 2016)
Ran a PoC of Phoenix over Hadoop/HBase clusters in AWS EMR (Jan 2016).
Updated the application stack to use the CodeDeploy and CloudFormation integration announced in Oct 2015 (Nov 2015)
Created a dashboard and new metric monitoring using AWS CloudWatch Dashboards, announced at AWS re:Invent 2015 (Oct 2015)
Performed a cross-account migration and restructured VPCs and VPC peering in AWS (Sep 2015)
Executed an Oracle-to-PostgreSQL migration and developed helper tools for Ora2Pg, using scripts and Python, to support the migration and validate the migrated data (Jul 2015)
Used Docker to package tools used in operations (Jun 2015)
Ran PoC tests for log and metrics monitoring using ELK and Graphite/InfluxDB/Grafana (Apr 2015).
Developed an automated data pipeline from the application to the BI system using AWS SWF, Python (sqlparse, Boto, Fabric) and SQL scripts (Mar 2015)
Tested Troposphere to develop AWS CloudFormation templates as code and added the new Signal feature for updates, cutting deployment time in half (Jan 2015)
Tools and Technology stack to support HPC, in-house and third-party Oil E&P applications: JBoss EAP cluster, SystemImager, xCAT, LSF, Torque (PBS) for dynamic cluster provisioning and scheduling, Puppet, Xen, VMware, SAN and NAS storage for infrastructure.
Tested and implemented a solution to optimize bandwidth usage when updating a Linux distribution customized for onshore and offshore oil exploration drilling operations systems over WAN/VSAT/radio network links (2014).
Tested and implemented a free compression and optimization solution for the VNC protocol over VSAT/radio network links (2014).
Tested and implemented the first servers in regional IT managed with the configuration management tool Puppet (2013).
Contributed to the evaluation and proof of concept of a Remote HPC Workstation with VirtualGL/VNC, VDI, PCoIP, RGS and RemoteFX protocols, a PCoIP/offload host board, and virtualization as a corporate solution (2013).
Administered and provisioned virtual servers using VMware (2010)
UNIX Support Analyst at TIC-E&P
● Recommended, planned, tested and implemented a high-availability computer cluster for a business application using the virtual IP technology implemented in the open-source Keepalived project (based in
● Customized the open-source software SystemImager for Petrobras UO-ES HPC cluster provisioning (2007)
● Recommended, planned, tested and implemented the regional IT department's first x86 virtual servers using Xen, achieving greater hardware independence (easing the migration and upgrade of the license servers) and better availability (2007)
● Tested and documented the corporate Solaris 8 Branded Zone implementation in Solaris 10 (2006)
Software Engineer Internship
● Problem/System analysis and ER Mapping/Class diagram prototyping
● Install, create and maintain databases, triggers, stored procedures, functions and migration scripts in PostgreSQL and MS SQL Server
● Develop and version-control C# source code for multi-tier MVC web applications using Subversion and Visual Studio
● Unit and Integration tests
IT Security Internship
● Run external penetration/vulnerability tests on client sites
● Run crafted network test scripts to check router and switch security, stability and ACL rules
● Firewall configuration (Symantec and Linux) for security policy compliance
● Perform log analysis with log aggregators and log checkers written in Perl