Sunday, June 26, 2011

Field Insight #1: Is Public Cloud the Dinosaur extinction event for IT organizations and CIOs?

This is the first in a series of "Field Insights": insights from conversations I am having with executives at some of the largest organizations in the world.


The genesis of this article comes from a conversation with the IT and development leadership of a Fortune 100 company: specifically the enterprise architects in the centralized IT organization and engineering/development management in one of its business units.  The company's centralized IT organization is building an internal private cloud for deployment of PHP applications to be leveraged by all its various business units.  However, the business units are comparing the internal cloud offering against the cloud offerings of Amazon, Rackspace and other public cloud providers - time to market, flexibility, speed of ramp-up, etc. being the selection criteria.  


In fact, one of the business units deployed one of its largest applications (especially from a brand perspective) on Amazon's EC2 infrastructure, and its goal is to eventually move all of its applications to the cloud, internal or public, within the next 18 months.  This one application gets about 60 million hits on a certain day every week. Amazon's cloud infrastructure is a good fit because the business unit can provision 100+ additional servers for that day and then bring them down again, which keeps its cost structure aligned with its variable need.
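To make the cost argument concrete, here is a rough sketch that compares paying for burst servers only on the peak day against keeping peak capacity provisioned all week. The baseline server count, the burst duration and the hourly rate are all hypothetical placeholders, not figures from the company or from Amazon's price list; only the "100 additional servers" comes from the example above.

```python
HOURS_PER_WEEK = 7 * 24


def weekly_cost_elastic(baseline_servers, burst_servers, burst_hours, hourly_rate):
    """Baseline runs all week; burst servers are paid for only while they run."""
    baseline = baseline_servers * HOURS_PER_WEEK * hourly_rate
    burst = burst_servers * burst_hours * hourly_rate
    return baseline + burst


def weekly_cost_fixed(baseline_servers, burst_servers, hourly_rate):
    """Peak capacity stays provisioned for the whole week."""
    return (baseline_servers + burst_servers) * HOURS_PER_WEEK * hourly_rate


# Hypothetical inputs: 20 baseline servers, 100 burst servers for 24 hours,
# and an assumed rate of $0.10 per server-hour.
elastic = weekly_cost_elastic(20, 100, 24, 0.10)
fixed = weekly_cost_fixed(20, 100, 0.10)
print(f"elastic: ${elastic:.2f} per week, fixed: ${fixed:.2f} per week")
```

With these made-up inputs the burst servers are billed for 24 of the 168 hours in a week, which is where the savings come from; the real ratio depends on how long the peak actually lasts.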


A couple of interesting implications of this deployment into the public cloud by the development organization:
  • The development organization deployed the application into the public cloud without needing approval from the centralized IT organization.  This means centralized IT lost the ability to enforce its architecture and security policies on the business unit.
  • Payment to Amazon is made directly by the development organization.  This means centralized IT lost the budget it used to charge back to the business units.
This example demonstrates that CIOs and internal IT organizations now have a new competitor in the public cloud providers.  To compete effectively and stay relevant to the business units, they need to deliver platform-as-a-service capabilities such as rapid provisioning of completely built development environments, standardized deployment, auto scaling of production environments, and metering of usage to create the appropriate chargebacks.

There are also significant implications for vendors:
  • The procurement and legal approval process for vendors to sell to this company is extremely stringent (a 3-6 month process with significant concessions needing to be made by the vendor). Vendors can now significantly reduce the effort and time needed to get approval to sell into the company by creating AMIs with their software pre-provisioned and allowing Amazon to become their reseller.  When the development organization procures the additional 100 instances, it can easily add the vendor's version of the AMI to its cart.  It pays Amazon (which has already been approved by its procurement/legal team) and the vendor gets paid by Amazon.
  • A key benefit for the development organization is that it pays for the vendor's software the same way it pays for Amazon AWS usage: based on utilization.  This means that the vendor has to create pricing that aligns with utilization.  If you sell annual subscriptions, you can set the per-hour price slightly higher than the hourly equivalent of an annual subscription, (subscription price)/(365*24); if you sell perpetual licenses, you could amortize the license price over the 3-5 year amortization schedule of a perpetual license.
  • A vendor's cost of sale declines significantly since they just need to focus on helping the development organization understand the value of their software.  The marketing is done by Amazon, the procurement/legal costs are almost completely avoided, and you do not leave money on the table for excess utilization.
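The utilization-based pricing arithmetic described in the list above can be sketched as follows. This is only an illustration of the formula: the function names, the 10 percent premium and the sample prices are assumptions made for the example, not recommendations or real price points.

```python
HOURS_PER_YEAR = 365 * 24


def hourly_from_subscription(annual_price, premium=0.10):
    """Hourly price set slightly above (subscription price)/(365*24).

    `premium` is a hypothetical markup; 0.10 means 10 percent above the
    hourly equivalent of the annual subscription.
    """
    return annual_price / HOURS_PER_YEAR * (1 + premium)


def hourly_from_perpetual(license_price, amortization_years=3):
    """Hourly price that spreads a perpetual license over its amortization schedule."""
    return license_price / (amortization_years * HOURS_PER_YEAR)


# Hypothetical list prices, for illustration only.
print(f"subscription: ${hourly_from_subscription(5000):.4f} per hour")
print(f"perpetual (3 years): ${hourly_from_perpetual(12000, 3):.4f} per hour")
print(f"perpetual (5 years): ${hourly_from_perpetual(12000, 5):.4f} per hour")
```

A longer amortization schedule lowers the hourly price, so the choice between 3 and 5 years is effectively a pricing decision.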
Here are some cloud statistics from the eWeek article "IT Cloud Services Spending to Reach $72.9 Billion in 2015: IDC Report". Cloud computing will continue to reshape the IT landscape over the next five years as spending on public IT cloud services expands at a compound annual growth rate (CAGR) of 27.6 percent, from $21.5 billion in 2010 to $72.9 billion in 2015. But the impact of cloud services will extend well beyond IT spending, according to research from IT analytics firm IDC. Cloud services are a critical component in a much larger transformation that IDC expects will drive IT industry growth for the next 25 years, the report said.


In 2015, public cloud services will account for 46 percent of net new growth in overall IT spending in five key product categories – applications, application development and deployment, systems infrastructure software, basic storage, and servers, according to the report. 


Definitions

Instances: These are the virtual machine instances that have been launched from an AMI. They can be in different states such as 'starting', 'running', and 'terminated'.
AMIs: These are the virtual machine images that you use to launch instances. They will have an OS and typically some software stack pre-installed for your convenience (for example, Ubuntu 9 with Zend Server pre-installed and running, ready to serve your PHP apps when an instance is launched).

Friday, June 10, 2011

Cloud Development = 2X the Salary for Developers? Developers as Production Operations/System Administrators?


Agile development requirements are forcing developers to start delivering on production operational tasks. Developers are not as knowledgeable about the infrastructure upon which their applications run in production – servers, load balancers, switches, scale up, proactive monitoring, etc.  Before the cloud era, they were extremely dependent upon the IT organization provisioning infrastructure for them.  This also led to IT having a lot more say and power over the enterprise standards.

With the advent of the cloud, developers can deploy their applications to live public cloud production environments without needing approval from IT (or at least without asking for it – if you don’t ask, you don’t give someone the authority to say no).  However, along with this agile deployment methodology/environment comes a new set of responsibilities: Service Level Agreements (SLAs) with the business still have to be met. The applications have to scale, stay up, and be as responsive as their users demand.

To meet the SLAs of applications they deploy in the cloud themselves, developers have to manage the following types of activities (which used to reside with production ops teams):

  • Elastic scaling of the application based on varying demand
  • Proactively monitoring application health
  • Preventing misuse of root access to production (such as making impromptu changes in production)
  • Ensuring deployment success from development to production 

The opportunity for vendors is to provide solutions that help developers automate these types of tasks.  Below are some examples of tools and solutions from the PHP ecosystem that help developers in these dual DevOps roles:
  • Elastic scaling of the application: Developers do not want to think about how many servers/instances are needed to address peak capacity or low utilization periods, yet will now have to ensure the infrastructure is provisioned appropriately.  Platform as a Service solutions are evolving to provide just in-time infrastructure scaling based on varying demand to maximize end-user responsiveness while minimizing cloud computing costs.  When system load increases, additional servers are launched. As demand decreases, servers are automatically decommissioned, while ensuring no user sessions are lost in the process.
  • Proactively monitoring application health: Monitoring at the transaction and application level is critical to maintaining SLAs.  Developers already know which thresholds to set so that, when one is exceeded, it indicates that something might be trending towards failure.  In the past, there was a chasm between production and development teams, and hence the conversation about what thresholds to set and what to monitor for never happened.  Now developers just need to talk to themselves.  Here is an example of the types of errors to be monitoring for and a monitoring solution that aggregates these errors across servers.
  • Preventing misuse of root access in production:  In the cloud, if developers deploy their applications into production, then they have 24x7 root access to production.  Ideally, the only reason they want access to production is to reproduce a production problem and try to fix it. With a high rate of feature/functionality releases (sometimes multiple releases a day) and the impact immediately felt by a large community of users, organizations cannot afford “bad” change.  Developers are tinkerers: they want to try out their fixes in production by making what they perceive to be small changes, which in reality can have large ramifications because of the legacy code that already exists.  Technology is evolving to eliminate the need to reproduce problems in production, which will hopefully reduce the mean time to repair and a developer’s need for access to production.
  • Ensuring deployment success from development to production: If both development and production are in the cloud, this could be as simple as exposing the cloud instance to the world.  If not, one has to ensure that the versions of PHP, extensions, etc. are all the same in both environments to get the highest deployment success rate. Zend’s PHP application server (Zend Server) provides a pre-configured PHP stack both via an Amazon AMI and on development workstations.
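The elastic scaling behavior described in the list above (launch servers when load rises, decommission them when demand falls, and lose no user sessions along the way) can be sketched as a simple threshold rule. This is a minimal illustration, not how any particular Platform as a Service product works: the function names, the thresholds and the server limits are all hypothetical.

```python
def desired_servers(current, load_per_server,
                    scale_up_at=0.75, scale_down_at=0.30,
                    min_servers=1, max_servers=100):
    """Return the target server count for the observed average load (0.0 to 1.0).

    The thresholds and limits are assumed values chosen for the example.
    """
    if load_per_server > scale_up_at:
        target = current + 1   # demand is rising: launch one more server
    elif load_per_server < scale_down_at:
        target = current - 1   # demand is falling: decommission one server
    else:
        target = current       # load is within the comfortable band
    return max(min_servers, min(max_servers, target))


def pick_server_to_remove(active_sessions):
    """Pick a server with zero active sessions so no user session is lost.

    `active_sessions` maps a server name to its current session count.
    Returns None when every server still has sessions to drain.
    """
    idle = [name for name, sessions in active_sessions.items() if sessions == 0]
    return idle[0] if idle else None


print(desired_servers(current=4, load_per_server=0.90))
print(pick_server_to_remove({"web-1": 12, "web-2": 0}))
```

Scaling one server at a time keeps the sketch simple; a real system would more likely scale in larger steps and add a cool-down period so it does not oscillate around a threshold.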
Controversial Question:  Is the cloud significantly reducing the influence and control that production operations organizations and CIOs will have going forward?  Is more power in the hands of the developers?  To whom, and how, should we as vendors be selling our solutions?