OnApp Has Been A Nightmare. OnApp Review.
This is the first time we have used our blog to describe our experience with a former vendor, one we used to outsource cloud hosting management services. Specifically, this article is about our bad experience with OnApp. I will include as much detail as possible in order to be helpful to any company considering their services. I want to stress that I would not have written this long OnApp review, but the way they treated us made me do it, because in today’s world such behavior should cost a company both existing and future customers.
Everyone should know what to expect when becoming a customer of OnApp. I will focus on the most serious issues, because in 4 years we had to open more than 240 technical support tickets due to problems with their software, which works out to 5 tickets per month, or more than one ticket per week. In many cases their support staff didn’t know what they were doing, and they even caused full data loss on 2 servers. Fortunately, we had backups to restore them.
OnApp Deployment & Some History
We decided to test OnApp for the first time 5 years ago. We contacted their sales department, they issued a license, and we ordered the hardware and network equipment needed to build our first cloud server hosting cluster. The first problem appeared right after the installation: the hypervisors were not detected as online. We contacted OnApp and they said they did not provide support for free licenses. After some days of back and forth with OnApp support and management, they agreed to help us test their software.
We created two VMs, but when we tried to remove them the process failed, so we opened another ticket. OnApp could not tell us why the process was failing. They removed the VMs somehow and never told us what had happened or what caused the issue. After that we had multiple problems with OnApp integrated storage, a cheap storage solution they bundle with the management software. Do not make the mistake of using it. You will have huge headaches with it like we had, and you may lose data like we did. I explain further below, so keep reading.
OnApp Caused Data Loss
4 years ago we had our first big issue with OnApp: due to an internal bug that they could not explain in detail, one virtual machine lost all of its data. The disk was corrupt and the data could not be recovered. Fortunately we were still in the testing phase and no customer data was affected, but if the cloud service had been live we would have had customer data loss because of OnApp.
OnApp Storage Performance
The performance of the OnApp integrated storage was always awful, even though we used enterprise Intel SSD drives. A benchmark run on a drive that was not a member of the integrated storage showed the performance advertised by Intel, but the same benchmark run inside a virtual machine created by OnApp on their storage software showed far worse results.
The OnApp vdisks often got out of sync, and we had to monitor them manually and initiate new syncs. That is a huge waste of time when you have hundreds of vdisks to watch. I always wondered whether all vdisks were OK, and I had to log in to the OnApp control panel regularly to check them and repair the broken ones by manually initiating syncs. The response from OnApp was always “we are working on it”, without anything actually happening. After some months of testing we decided not to use it and to give it another chance later in case it improved.
New Testing Has Started
Some months later we decided to give them another chance and see whether they had improved. After many, many tickets and much wasted time, we got OnApp configured to a state where we decided to put it into production. That was 4 years ago. The installation and setup didn’t go as described in their documentation, so we had to contact support almost daily. We announced the new service and started selling cloud hosting to customers. We had issues every week and kept contacting their support, but the same issues happened again and again and no permanent fixes arrived. OnApp seemed focused on gaining new customers instead of supporting existing ones. Integrated storage was a total disaster and constantly wasted our time, because we had to manually repair disks that were getting out of sync for no apparent reason. OnApp could not explain why that was happening.
Every few weeks, integrated storage disks got so far out of sync that OnApp could not repair them, and they had to format the entire physical disk before adding it back to the integrated storage and repairing the vdisks. The repair then took a long time because all the data had to be copied to the formatted disk. If the other disk holding the second copy of the data had failed in the meantime, we would have had data loss on many cloud servers. OnApp kept assuring us that they were working on improving integrated storage and that it was only a matter of time before everything became stable. That never happened.
First Serious OnApp Failure
We had many small problems with the OnApp management platform every now and then, which we could get fixed only by opening multiple tickets. The software was never stable enough to work without problems for more than a few days. We always had to contact technical support for help. In many cases their first-line support technicians didn’t know what they were doing, and the quality of support was really bad. That’s why I always had to ask for the ticket to be escalated to admins.
One of the hypervisors in the cloud hosting cluster failed, and the migration feature, which was supposed to migrate and start all cloud servers on another hypervisor, failed to do so. The process kept failing and we could not get the affected cloud servers up. We opened a ticket with support and it took ages for them to resolve the issue. Finally, they did. The migration feature kept breaking regularly after that, and each time we had to contact technical support and wait for the ticket to be escalated, because only administrators could fix it. It was a very unpleasant experience, but at that time OnApp was the only choice for web hosting companies that wanted to sell cloud hosting, and OnApp had integration with the WHMCS billing system. That plugin had lots of bugs as well and required many, many hours to get working.
Second Serious OnApp Failure
This was the worst experience we had with OnApp, because it resulted in the complete corruption of one of the cloud servers. Let me explain what happened. A virtual server became unresponsive and we could not start or restart it from the OnApp control panel. The restart simply failed. We opened a ticket with OnApp and they started investigating. They never told us exactly what they did, but they corrupted all copies of the cloud server’s disk. Suddenly they stopped responding in the ticket, so I called their support number. The person I talked to said he didn’t know what OnApp was and could not help. It was clear to me that they had outsourced their phone support to some UK call center and the guy on shift didn’t know the company he was supposed to serve. I was finally able to find the phone number of one of their sales managers, who told me their admins would come in 8 hours. I asked him whether they had systems administrators on duty 24/7, and the answer was clearly no. When their administrator came to work, he confirmed that the cloud server was corrupt and advised us to create a new server, but the data would be lost. The huge problem was that the customer had not ordered backups with that server. Fortunately he had moved to the cloud server only recently, so we copied his data from the backups of the shared hosting server he had been using. He was very lucky, and OnApp failed big time.
So, we had many problems with how they handled this incident.
- Their support disappeared at some point even though they knew there was a problem (maybe their shift ended and they just left).
- They corrupted all copies of the data and caused full data loss.
- Their phone support didn’t know anything about the existence of OnApp.
- They could not provide 24/7 support because no admins were on shift, which caused more downtime for our customer.
Third Serious OnApp Failure
This problem once again resulted in complete data loss, because OnApp corrupted the disk of the cloud server. At that time OnApp supported shrinking disks. A customer placed an order to decrease their disk size. The process was started and OnApp initiated the shrinking. It took very long, which is expected, but the customer decided it was taking too long and cancelled the process from the OnApp user interface. OnApp stopped the process but then ran a file system check on the mounted disk, which wiped all data. They didn’t accept that explanation, although we provided logs with timestamps confirming it. The final result was that we once again had to restore the data from backups because of an action performed by OnApp. By that point my decision had already been made, and we were working on finding an OnApp alternative. We found one, and for some months now we have had a cloud hosting cluster that has not had a single problem.
OnApp Provided The Worst Technical Support We Have Ever Seen
We have dealt with 50+ software companies, datacenters and other IT vendors. The support at OnApp was the worst I have seen in my life. Many of the problems we reported to them were never resolved, although they said they would fix them “soon”. I waited 2 years for that “soon” before I decided to drop OnApp and stop doing business with them.
I will mention some of the problems that were never resolved while we were working with them and that, as far as I know, are still not resolved.
- CentOS servers being deployed with a broken hostname: every CentOS cloud server deployed by OnApp had a broken hostname. An incorrect hostname causes many problems. For one, email messages will bounce: if you want to create a server with the hostname cloud.domain.com, OnApp will set the hostname to just “cloud”, which is not a fully qualified hostname. We had to log in to each provisioned cloud server manually and configure the hostname, doing what OnApp was supposed to do for us automatically. Many months passed and I talked to many managers about this issue, but no solution was ever provided. I cannot imagine how you can leave such a bug unfixed for so long. Basically, the software was provisioning misconfigured servers. Someone from their marketing staff called me every 2-3 months to ask how happy I was with OnApp. I always told them I was completely unhappy and that the only reason we were still a customer was that there was no other option on the market with WHMCS integration that we could use. At the moment that is the case for thousands of web hosting companies. They kept telling me the bug would be fixed in the next release, and it never was. 3 months later I would get the same call, again with no result. I could not believe how a company with such bad support could survive and still be in business. I was even thinking about starting a software company to compete with OnApp, because it would not be hard to beat that kind of technical support. I am sure that will happen and some company will do it.
- Adding IPs was causing kernel panics: first of all, OnApp doesn’t support communicating with the virtual machines over a non-default SSH port. We had SSH configured on a different port, because SSH brute-force attacks target port 22 and it is really not smart to keep SSH running there. OnApp requires port 22. As a workaround we set up a second SSH service listening on port 22 and allowed only OnApp to access it. OnApp was then able to connect to the virtual machine, but whatever it did to rebuild the network and add the new IP always resulted in a kernel panic, and the cloud server had to be rebooted. We were forced to stop using OnApp to add IPs to the servers and started adding them manually to avoid downtime. We opened many tickets, but OnApp did nothing to resolve the problem. They started blaming our OS template. I told them it was their own OS template with the cPanel control panel installed. They said that since the template was modified they couldn’t support it. Then I asked what they expected from customers using OnApp cloud servers: to run a minimal OS install and never install anything on it because it would become unsupported? That would make OnApp useless. One of their admins got involved, confirmed that the issue was a bug and said they would work on a fix. It has never been fixed.
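The broken-hostname problem from the first item above is easy to detect automatically instead of logging in to every server by hand. Below is a minimal sketch in Python. It assumes you keep a mapping from the hostname each customer ordered to the hostname actually found on the server; the function names and the sample data are illustrative only and are not part of OnApp or of our own tooling.

```python
def is_fully_qualified(hostname: str) -> bool:
    """Return True if the hostname has at least two non-empty dot-separated labels."""
    labels = hostname.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)


def misconfigured(servers: dict[str, str]) -> list[str]:
    """List the ordered hostnames whose server ended up with a different
    or non-fully-qualified hostname.

    `servers` maps the hostname that was ordered to the hostname actually set.
    """
    return [
        wanted
        for wanted, actual in servers.items()
        if actual != wanted or not is_fully_qualified(actual)
    ]


# Hypothetical sample: the first server was provisioned with a truncated hostname.
fleet = {
    "cloud.domain.com": "cloud",
    "web.domain.com": "web.domain.com",
}
print(misconfigured(fleet))  # ['cloud.domain.com']
```

A check like this only reports the problem. Fixing the hostname on the affected servers is still a separate step.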
OnApp Provided The Worst Customer Service We Have Ever Seen
They were always customer-unfriendly with us. I don’t know about all their customers, but every web hosting owner I know who uses OnApp shares my opinion. The problem is that they can’t find a good OnApp alternative. I am happy we found one, and now we can focus on selling cloud hosting instead of fixing the problems of the cloud management platform on a weekly basis.
When they understood that they were losing our business, they emailed to ask whether the price was the problem and offered to decrease it to keep us as a customer. With all that past experience with their software and support, I would not consider using them even if they gave us the software for free. It is a headache. It doesn’t work as expected and the support is awful. Everyone I know who uses OnApp tells me the same.
OnApp charges based on the number of hypervisors and their CPU cores. When we started migrating the cloud servers to the new cloud hosting cluster and removing hypervisors, I thought our monthly cost would go down, since we had decreased the number of CPU cores in OnApp, but they refused to decrease the price. I was shocked, but I didn’t have the time, and I didn’t want the headache of tickets and calls, to explain that when you add 20 CPU cores and pay $200 more, you also expect to pay $200 less once those 20 CPU cores are no longer in use and have been removed from OnApp. This is cloud, isn’t it? With OnApp, cloud is not cloud. It is defined to serve their best interest only. That is when they made me write this article and share our unpleasant experience with them. We are not the only ones who have done it. If you search on Google you will find many other similar reviews of OnApp.
Personally, I have read horror stories about OnApp integrated storage where it led to multiple days of downtime for hosting providers with hundreds of cloud servers. I certainly didn’t want to go through that, and found an alternative to switch to. If you are using OnApp integrated storage, you had better find a new SAN storage system or you risk a lot.
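To make the pricing expectation concrete, here is a small Python sketch of a purely linear per-core model. The $10 per core rate is derived from the “20 CPU cores for $200” figure above. The flat rate itself and the starting count of 100 cores are assumptions for illustration, not OnApp’s actual price list.

```python
# Assumed flat rate: $200 for 20 cores gives $10 per core per month.
PRICE_PER_CORE = 200 / 20


def monthly_fee(cores: int) -> float:
    """Monthly license fee under a purely linear per-core model."""
    return cores * PRICE_PER_CORE


base = 100  # hypothetical number of licensed cores

increase_when_adding = monthly_fee(base + 20) - monthly_fee(base)
decrease_when_removing = monthly_fee(base) - monthly_fee(base - 20)

print(increase_when_adding, decrease_when_removing)  # 200.0 200.0
```

Under usage-based pricing the two numbers are the same. What we experienced was the first number applied when we grew and the second one refused when we shrank.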
Conclusion: We Are Finally Rescued 🙂
I am happy that we are safe now that we have escaped from the OnApp nightmare. We now have an all-SSD cloud hosting cluster which has 10x better performance and, most importantly, works without the usual OnApp issues.
With OnApp the provisioning of a server took 3-4 minutes, but with our new cloud management platform we can activate multiple cloud servers in seconds. We now have an in-house system over which we have 100% control: we can manage it, update it and add new features whenever we like.
Adding more disk space in OnApp requires a reboot. We can now add more space without one, because a reboot is not really needed. OnApp is just not properly configured: it uses old versions of virtualization software. The customers’ cloud servers that we migrated from OnApp to the new cloud hosting cluster got 25% better performance without any additional CPU resources or RAM.
Dropping OnApp and using hypervisors configured by us let us free up lots of SSD disk space on our SAN storage. If you deploy 1000 cloud servers from a 10GB OS template in OnApp, you need 10TB of disk space. If we deploy 1000 cloud servers in our custom cloud hosting cluster setup, we only need 10GB to do so. That shows how inefficient OnApp is. The more you grow, the more inefficient you become by using OnApp, and at some point you will be forced to drop it and make the switch. It is much easier to do that when you have some hundreds of virtual machines than when you have thousands and the migration becomes much harder.
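The storage numbers above can be checked with a few lines of Python. This is only a sketch of the arithmetic. It assumes the saving comes from all servers sharing one read-only base image, with each server storing only its own changes. The function names and the `delta_gb_per_server` parameter are made up for the example, and this post does not describe the actual mechanism.

```python
def full_copy_gb(servers: int, template_gb: float) -> float:
    """Space needed when every server gets its own full copy of the template."""
    return servers * template_gb


def shared_base_gb(servers: int, template_gb: float,
                   delta_gb_per_server: float = 0.0) -> float:
    """Space needed when all servers share one base image and each server
    stores only its own changes (delta_gb_per_server, assumed 0 at deploy time)."""
    return template_gb + servers * delta_gb_per_server


print(full_copy_gb(1000, 10))    # 10000 GB, i.e. 10 TB
print(shared_base_gb(1000, 10))  # 10.0 GB right after deployment
```

The second figure is the best case at deployment time. It grows as servers write their own data, for example `shared_base_gb(1000, 10, 0.5)` gives 510.0 GB, which is still far below the full-copy figure.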
Looking for an alternative?
Several companies have contacted me looking for an alternative solution and asking about ours. I’ve built private and hybrid cloud clusters for some of them using our in-house solution with WHMCS integration, fully managed by us.
I will describe how it works below.
- We set up and manage a private or hybrid cloud cluster for you.
- We provide you with the WHMCS integration, which allows you to automatically provision and manage cloud VMs. That includes activation/suspension/termination, upgrades/downgrades of individual resources (CPU, RAM, disk, network speed, OS reload, console) plus statistics about the resource usage of the VM.
- We monitor all storage nodes and hypervisors and take immediate action if anything goes wrong.
- The customer pays for the currently provisioned resources on the 1st of each month (number of CPU cores, RAM, storage).
- For some customers we provide management at the OS level for CentOS/RedHat virtual machines.
Please do not contact us if you just want to find out what we are using. We have spent a lot of time and effort developing the software and tools to build and manage an enterprise cloud solution with high availability and performance which works 24/7/365, and we will not provide it for free.
If you want us to build a private or hybrid cloud cluster for you and integrate it into your infrastructure with WHMCS to automatically sell cloud VMs with true high availability, please email me at vrobinson [at] scalahosting.com.