Daily archives: July 30, 2015


OpenNebula

OpenNebula 4.10 dashboard, running on a 4-compute, 5-store cluster

This could have been the first article of this blog. OpenNebula is a modular, cloud-oriented solution comparable to OpenStack, driving heterogeneous infrastructures and orchestrating storage, network and hypervisor configuration.

In the last 7 months, I’ve been using OpenNebula with Ceph to virtualize my main services, such as my mail server (200GB storage), my NNTP index (200GB MySQL DB, 300GB data), my wiki, Plex, SABnzbd, … pretty much everything, except my DHCP, DNS, web cache and LDAP services.
Shortly before leaving Smile, I also used OpenNebula and Ceph to store our system logs, involving Elasticsearch, Kibana and rsyslog-om-elasticsearch (that’s right: no Logstash).

This week, a customer of mine asked for a solution that would allow him to host several cPanel VPS, given that he already had a site dealing with customer accounts and billing. After he turned down my scripts deploying Xen or KVM virtual machines, as well as some Proxmox-based setup, we ended up talking about OpenNebula.

OpenNebula 4.12 dashboard, running on a single SoYouStart host

The service is based on a single SoYouStart dedicated host: 32GB RAM, 2x2TB disks and a few public IPs.
Sadly, OpenNebula is still not available for Debian Jessie. Trying to install the Wheezy packages, I ran into dependency issues regarding libxmlrpc. In the end, I reinstalled the server with the latest Wheezy.

From there, installing Sunstone and the OpenNebula host utilities, then registering localhost among my compute nodes and my LVM volume group among my datastores, took a couple of hours (the commands are sketched below).
Then I started installing CentOS 7 using virt-install and VNC, building cPanel, installing CSF, and adding my scripts that configure the network according to the OpenNebula context media, … the cloud was operational five hours after Wheezy was installed.
I finished by writing some training material (15 pages, mostly screenshots) explaining the few actions required to create a VM for a new customer, suspend his account, back up his disks, and eventually purge his resources.
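
For the record, those registration steps and the guest install boil down to something like the following, assuming OpenNebula 4.12’s CLI on KVM; the driver names and virt-install values are illustrative and should be checked against your version’s documentation:

# onehost create localhost --im kvm --vm kvm --net dummy
# onedatastore create lvm.ds
# virt-install --name centos7 --ram 2048 --vcpus 2 \
    --disk path=/dev/vg0/centos7 --network bridge=br0 \
    --graphics vnc,listen=127.0.0.1 --cdrom /path/to/CentOS-7-minimal.iso

Here lvm.ds stands for a short datastore template declaring at least a NAME, TYPE = IMAGE_DS and the LVM transfer driver (TM_MAD = fs_lvm on 4.x, if memory serves).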

OpenNebula VNC view

At first glance, using OpenNebula to drive virtualization services on a single host could seem overkill, to say the least.
Still, with a customer who doesn’t want to know what a shell looks like, and for whom even Proxmox is not an acceptable answer, I feel confident OpenNebula can be way more useful than we give it credit for.

Crawlers

Hosting a public site, you’ve dealt with them already.
Using your favorite search engine, you’re indirectly subject to their work as well.
Crawlers are bots that query, and sometimes map, your site to feed some search engine’s database.

When I started looking into the subject, a few years back, you only had to know about /robots.txt, to prevent your site from being indexed, or at least restrict crawling to the relevant contents of your site.
More recently, we’ve seen the introduction of XML files such as sitemaps, allowing you to efficiently serve search engines a map of your site.
This year, Google revised its rules to also prioritize responsive and platform-optimized sites. As such, they now recommend allowing crawlers to fetch JavaScript and CSS files, warning (threatening, really) that blocking these accesses could result in your ranking being lowered.
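
For illustration, a minimal robots.txt along those lines, with hypothetical paths; note that Allow and the $ wildcard are extensions honored by Google, not part of the original exclusion standard:

User-agent: *
Disallow: /admin/
Disallow: /tmp/
Allow: /*.css$
Allow: /*.js$
Sitemap: https://www.example.com/sitemap.xml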

At this point, indexing scripts and style sheets, you might say (I’m surprised not to find the remark in the comments) that Google actually indexes your site’s vulnerabilities, creating not only a directory of the known Internet, but a complete map of everyone’s hidden pipes, one that could some day be used to breach your site, if it hasn’t been already.

Even if Google is the major actor on that matter, you have probably dealt with Yahoo, Yandex.ru, MJ12, Voltron, … whose practices are similar. Over the last years, checking your web server logs, you might have noticed a significant increase in the proportion of bot queries over human visits, partly due to the multiplication of search engines, though I suspect mostly thanks to botnets.
Identifying these crawlers can be done by checking the User-Agent header sent with every HTTP request, as sketched below. On small-traffic sites, crawlers may very well be your only clients.
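
A rough way to tally them, assuming Apache’s stock “combined” log format, where the user agent is the last quoted field of each entry:

# awk -F'"' '{print $6}' /var/log/apache2/access.log | sort | uniq -c | sort -rn | head -20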

Assuming your sites are subject to DDoS attacks and scans for software vulnerabilities (the top targeted solutions being WordPress, phpBB and phpMyAdmin), you should know attackers will eventually masquerade their user agent, most likely branding themselves as Googlebot.

To guarantee a “Googlebot”-branded query actually comes out of some Google server, you just need to check the pointer (PTR) record associated with the client’s IP, and verify that this name resolves back to the same address: an attacker controls the reverse zone of his own IPs, but not Google’s forward zones. A way to do so in Apache (2.4) could be to use something like this (PoC to complete/experiment); a standalone equivalent is sketched below.
Still, maybe it is wiser to just drop all Google requests as well. Without encouraging Tor usage, it’s probably time to switch to DuckDuckGo?
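
A minimal shell sketch of that double check, with a hypothetical script name and assuming dig is available (Google’s crawlers reverse-resolve under googlebot.com or google.com):

#!/bin/sh
# check-googlebot.sh: forward-confirmed reverse DNS for a client IP
IP="$1"
PTR=$(dig +short -x "$IP")
case "$PTR" in
  *.googlebot.com.|*.google.com.)
    # the PTR name must resolve back to the original address
    dig +short "$PTR" | grep -qx "$IP" && echo genuine || echo spoofed ;;
  *)
    echo spoofed ;;
esac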

Otherwise, another good way to deny these connections is described here; I may try to add something like this to my Puppet classes.

Don’t trust the Tahr

Beware: since the latest Ubuntu kernel upgrades (14.04.2), you may lose network connectivity when rebooting your servers!

I had the problem four days ago, rebooting one of my OpenNebula hosts. With the host still unreachable after 5 minutes, I logged in physically, to see that all my “p1pX” and “p4pX” interfaces had disappeared.
Checking the udev rules, there is now a file pinning the interface mapping. On a server I have not rebooted yet, this file doesn’t exist.
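
On the rebooted host, that generated file looks something like this (path and content quoted from memory, so treat the specifics as illustrative):

# cat /etc/udev/rules.d/70-persistent-net.rules
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:11:22:33:44:55", ATTR{type}=="1", NAME="eth0"

Each line pins a MAC address to a fixed interface name, which is how the p1pX and p4pX names ended up remapped.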

The story could have ended here. But with Ubuntu, updating is a daily struggle: today, one of my Ceph OSD hosts (carrying 4 disks) spontaneously stopped working.
Meaning: the host was still up, and I was able to open a shell using SSH. Checking processes, all the ceph-osd daemons were stopped. Starting them showed no error, while the processes were still absent. Checking dmesg, I had several lines of SSL-related segfaults.
As expected, rebooting fixed everything, from Ceph to my network interface names.
It’s on days like these that I most enjoy freelancing: I can address my system and network outages in time, way before it’s too late.

While I was starting to accept Ubuntu as safe enough to run production services, renaming interfaces on a production system is unacceptable. I’m curious to know how Canonical deals with that while providing BootStack and OpenStack-based services.

Note there is still a way to prevent your interfaces from being renamed:

# ln -s /dev/null /etc/udev/rules.d/75-persistent-net-generator.rules