Steven Marks
Lead Cloud Engineer & Full Stack Developer
Professional Summary

Lead Cloud Engineer and Full Stack Developer with over a decade of experience spanning cloud architecture, platform engineering and software delivery. Specialises in designing and operating production platforms on Google Cloud (GKE, Cloud SQL, Cloud Run), with infrastructure managed as code in Terraform/OpenTofu and GitOps continuous delivery through ArgoCD.

Combines hands-on Python full-stack development (Django, FastAPI) with technical leadership — setting direction, mentoring engineers and shipping features end-to-end while communicating effectively across all organisational levels.

Active open-source maintainer of a 600+ star project and author of technical tutorials at totaldebug.uk, with a track record of sustained delivery and community engagement.


Key Skills
Cloud & DevOps: Google Cloud Platform (GCP), Kubernetes (GKE), Docker, Terraform / OpenTofu, GitOps (ArgoCD), GitHub Actions (CI/CD), Ansible
Development: Python (Django, FastAPI), PostgreSQL, REST API Design, LLM / AI Integration
Leadership & Delivery: Team Leadership, Mentoring, Agile Delivery, SLA Management
Experience
Lead Cloud Engineer & Developer
Periphery Security Ltd
2024 - present

Sole cloud engineer for an early-stage security startup, owning the design and implementation of the entire GCP platform end-to-end. Lead developer for the backend API and frontend web application, while managing Agile sprint delivery across a small team of engineers.

Responsibilities
  • Architected, designed and implemented the complete cloud platform on GCP as the sole cloud engineer;
  • Led development of the backend API and frontend web application, taking features from design through to production;
  • Owned project delivery end-to-end using Agile practices, running sprint planning, stand-ups and retrospectives to track progress, remove blockers and ship releases on schedule;
  • Coordinated and mentored a small team of engineers, setting technical direction and reviewing their work.
Achievements
  • Built the entire GCP platform from the ground up, running production workloads on GKE with all infrastructure managed as code in OpenTofu/Terraform, cutting environment provisioning time from days to hours;
  • Designed a GitOps continuous-delivery pipeline using ArgoCD and Argo Image Updater, increasing deployment frequency to multiple times per week and reducing manual deployment effort by ~90%;
  • Lead developer for the Python backend (Django/FastAPI) and frontend web application, delivering 350+ features across 200+ versioned releases via automated semantic-release CI/CD;
  • Built SBOM analysis across 2 industry-standard formats (CycloneDX, SPDX), with configuration hardening checks spanning 13 control categories mapped to compliance frameworks including SOC 2, ISO 27001, PCI-DSS and the EU Cyber Resilience Act;
  • Integrated LLM/ML capabilities (Google Gemini / Vertex AI) to auto-generate device-aware remediation, eliminating manual fix research so findings ship with a ready-to-run command or one-click TUI auto-fix;
  • Lead contributor to EdgeWalker, the company's open-source edge security scanner (Python), driving the majority of its development to audit networks for open ports, default credentials and known vulnerabilities.
Technologies
Google Kubernetes Engine, Google Cloud SQL (Postgres), Google Cloud Functions (Python), Google Cloud Storage (GCS), Google Cloud Run (Python), Google Gemini Enterprise Agent Platform, GitHub Actions, Python (Django, FastAPI), Jira, OpenTofu / Terraform, Grafana, ArgoCD / Argo Image Updater, RabbitMQ, Mosquitto MQTT
Full Stack Systems Engineer III
Rackspace Technology
2020 - 2024

Responsible for project delivery, design, and development of business products and support tooling.

Responsibilities
  • Improved existing tooling and developed new tooling;
  • Delivered projects within a Scrum framework, participating in sprint planning, daily stand-ups and retrospectives;
  • Completed approved projects within deadlines;
  • Guided and supported engineering teams;
  • Acted as a point of escalation for support teams and Full Stack Systems Engineers;
  • Worked with external stakeholders to qualify and document new products, both hardware and software.
Achievements
  • Developed analytics tools that identified billing discrepancies, recovering $1.2 million in revenue;
  • Built a fully automated test lab using Terraform and Ansible, supporting both manual user testing and automated testing;
  • Implemented an end-of-life program for outdated devices, generating an additional $600,000 in annual revenue;
  • Automated the deployment and configuration of the Datadog monitoring tool, enabling support teams to update configurations automatically from a set template for additional plugins.
Technologies
Jira, Python, PowerShell, GitHub Actions, AWS SSM, Google BigQuery, PostgreSQL, Terraform, Ansible, Datadog
Full Stack Systems Engineer II
Rackspace Technology
2019 - 2020

Responsible for the design and development of business products and support tooling.

Responsibilities
  • Improved business workflows and tooling to provide a more efficient service;
  • Delivered work within a Scrum framework, participating in sprint planning, daily stand-ups and retrospectives;
  • Independently managed assigned tasks and communicated with task stakeholders;
  • Worked closely within the team and with external stakeholders, assisting senior engineers in qualifying and documenting new products, both hardware and software;
  • Identified areas of improvement within the team and the larger organisation;
  • Designed, engineered, architected and integrated tooling to reduce total costs, support customers, and minimise human error.
Achievements
  • Implemented single-click login for customer devices, saving an average of 15 minutes per device connection;
  • Spearheaded the successful migration of the Leeds data center and its clients, resulting in an annual cost reduction of £300,000 for the company;
  • Standardised the deployment and management processes of IBM BigFix for efficient patching and configuration management.
Technologies
Jira, Python, PowerShell, GitHub Actions, OKD, PostgreSQL, Linux, Windows Server, Cisco, Juniper, IBM BigFix
Technical Support, Team Leader
Rackspace Technology
2017 - 2019

Led a team of 15 Windows Engineers, providing 3rd line support to our customer base.

Responsibilities
  • Guided the professional development of the Windows Engineering team through coaching, mentoring and regular performance reviews;
  • Conducted 1-to-1 meetings to provide personalised feedback and set individual growth plans;
  • Planned on-call rotas and shift patterns to ensure balanced workloads;
  • Fostered a collaborative team environment and encouraged continuous learning and skills development;
  • Reported on ticket queues and ensured adherence to SLAs.
Achievements
  • Created a Python and Grafana tool to track stale tickets and closure times, speeding up resolutions and increasing accountability;
  • Significantly reduced ticket queue lengths by implementing more efficient ticket management procedures and reporting;
  • Under my mentorship, the team reduced ticket latency from 4 weeks to 30 minutes.
Technologies
ServiceNow, Python, Grafana, InfluxDB
Technical Lead
Datapipe (Acquired by Rackspace Technology)
2015 - 2017

Led the technical management of high-value enterprise customers, ensuring optimal solutions and acting as a key point of escalation for critical issues.

Responsibilities
  • Held overall technical responsibility for several high-value enterprise businesses;
  • Collaborated closely with the Account Director and Service Manager to coordinate teams and deliver superior service;
  • Tracked and progressed problems and projects to drive continuous improvement;
  • Ensured effective communication and alignment with key stakeholders to meet customer expectations and business objectives.
Achievements
  • Retained one of the company's largest customers, preventing their departure and securing an additional £1.2 million in revenue through expanded contracts;
  • Delivered a comprehensive CMDB device and relationship mapping, enabling enhanced support and impact analysis;
  • Conducted monthly usage analysis to identify potential savings and performance bottlenecks, leading to the discovery and mitigation of a major infrastructure bottleneck;
  • Enforced a billing audit to ensure accurate monthly billing for all devices;
  • Resolved major issues that were negatively impacting customer performance, significantly improving satisfaction and service delivery;
  • Successfully migrated customers from the Leeds-based datacenter to a modern, state-of-the-art datacenter in London.
Technologies
VMware vSphere, VMware vRealize, ServiceNow, Windows Server, Cisco ASA
Senior Hosting Engineer
Adapt (Acquired by Datapipe)
2014 - 2015

Designed and delivered customer solutions at a principal engineer level, maintained the datacenter, and provided guidance to junior engineers.

Responsibilities
  • Architected and managed the deployment of customer environments;
  • Improved technical documentation and procedures;
  • Supported the maintenance and operation of the datacenter.
Technologies
Juniper SRX and EX, VMware vSphere, vCloud Director, Xen Server, NetApp, Windows Server, Linux (RHEL & Ubuntu), DNS, PHP, MySQL
Certifications
Professional Cloud Architect - Google Cloud Certified
2020 - 2024
Google (6b843c80-71aa-44b3-b3c4-5e01ffb7b994)
AI Ready
2023
Rackspace Technology (93cf1769-57d6-45c8-b2d4-a3617b9ae672)
Projects

A selection of open-source projects I build and maintain.

Atomic Calendar Revive - An advanced calendar card for Home Assistant.
EdgeWalker - A high-performance edge security scanner that audits home networks for open ports, default credentials and known vulnerabilities.