darkness

Saturday, 27 August 2005

Fedora Core 1 to CentOS 4

darkness @ 13:19:24

Upgraded my FC1 firewall to CentOS 4 last night. Went fairly well, really. My procedure was something like:

  1. Follow yum upgrade instructions from CentOS forums. I had to install the kernel by hand, resolve a few dependencies, etc., but this worked not so bad.
  2. find / -name '*.rpmsave' -o -name '*.rpmnew' (all the .rpmsave and .rpmnew files) and handle them.
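For the second step, here is a rough Python sketch of the same search that also pairs each leftover with the config file it belongs to, so the two can be diffed. The function name `find_rpm_leftovers` is made up for illustration; it is not part of any RPM or yum tooling.

```python
import os

def find_rpm_leftovers(root):
    """Yield (leftover, original) path pairs for .rpmsave/.rpmnew files under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            for suffix in (".rpmsave", ".rpmnew"):
                if name.endswith(suffix):
                    leftover = os.path.join(dirpath, name)
                    # Strip the suffix to get the file the package manager left alone
                    yield leftover, leftover[: -len(suffix)]
```

Calling `find_rpm_leftovers("/etc")` and printing each pair gives a to-do list of files to compare by hand.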

Actually, there was a whole bunch of stuff I did. I’d say I should have written it down, but for anyone familiar with Linux it was just a bunch of typical tasks.

One thing that got me is this: CentOS 4 (and presumably RHEL 4) doesn’t support ReiserFS. At all. Used to be in a kernel-unsupported package, but now even that’s gone. Rescue disk wouldn’t read my disk. So I decided to migrate to LVM and ext3. For this task I used Knoppix 3.7 to resize the ReiserFS partition down, then used Ubuntu to make LVM and ext3 (because that version of Knoppix doesn’t include LVM, methinks). Then Grub got screwed up so I had to re-run grub-install; used CentOS 4 disc 1 in rescue mode for that. Moral of the story: if you like RH, or even Fedora distributions, save yourself hassle and keep away from ReiserFS. (On another note, ReiserFS may or may not work with SELinux. I kind of gather that patches to make ReiserFS and SELinux play nice are in the works and very close to being in the upstream—if they’re not there already. At least check it out if you’re interested in both of these things working together on the same system.)
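One way to check for this before upgrading is to look at /proc/filesystems on the target kernel (from a rescue disk, say). Below is a small sketch under that assumption; the function names are hypothetical. Note that the file only lists filesystems that are built in or whose module is currently loaded, so a missing entry is a hint rather than proof.

```python
def supported_filesystems(proc_text):
    """Return the set of filesystem names in /proc/filesystems-style text."""
    names = set()
    for line in proc_text.splitlines():
        fields = line.split()
        if fields:
            # Lines look like "nodev<TAB>proc" or "<TAB>ext3"; the name is last
            names.add(fields[-1])
    return names

def kernel_supports(fs_name, path="/proc/filesystems"):
    """Check the running kernel's registered filesystems for fs_name."""
    with open(path) as f:
        return fs_name in supported_filesystems(f.read())
```

On a Linux box, `kernel_supports("reiserfs")` would then report whether the running kernel currently knows about ReiserFS.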

Found a repository with Fedora Extras rebuilt for CentOS 4. Used DKMS package from there to get MPPE support in CentOS 4. Also replaced IPsec on my wireless side with OpenVPN which was incredibly simple. I love OpenVPN.

Thursday, 25 August 2005

pyMPI RPM

darkness @ 16:50:12

I’m taking a “cluster computing” class that’s going to be using MPI. The teacher said we had to use C/C++ because those are the only languages we have bindings for. To spite him I went ahead and installed pyMPI, which means I’ve made an RPM of pyMPI. Works for me on FC4. (That is, I can import mpi from Python and it works as expected. I haven’t actually done any MPI work with it, really.) It’s not clear to me whether I need to use the special pyMPI interpreter, or if I can use the normal Python interpreter. To be determined.
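A quick way to settle the interpreter question is a probe script run under each interpreter. This is only a sketch: the module name `mpi` is what pyMPI installs, but `mpi.rank` and `mpi.size` are attribute names as I remember them from pyMPI, so they are read with `getattr` and should be treated as assumptions.

```python
def describe_mpi():
    """Report whether the 'mpi' module imports here, plus rank/size if it does."""
    try:
        import mpi  # provided by pyMPI; not part of the standard library
    except ImportError:
        return "mpi module not importable from this interpreter"
    # Attribute names are assumptions; fall back to None if they are absent
    rank = getattr(mpi, "rank", None)
    size = getattr(mpi, "size", None)
    return "mpi imported: rank %s of %s" % (rank, size)

print(describe_mpi())
```

Running the same file with the normal Python interpreter and with the pyMPI interpreter, and comparing the output, would show whether the special interpreter is needed.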

Saturday, 20 August 2005

SATA

darkness @ 10:11:31

Turns out, before I go full speed ahead shopping for controllers, I need to know what I’m looking for (go figure). “SATA” apparently doesn’t say it all, and “SATA II” doesn’t even refer to anything concrete, supposedly.

I originally thought I was looking for “SATA II” drives for that 300MB/s throughput. Once I found out I’d never hit that on a single drive, I was more interested in the features that supposedly accompany SATA II. Then I read stuff like “Dispelling the Confusion: SATA II does not mean 3Gb/s” and “SATA II still on rocky road”.

The long and short of it: SATA II was the name of the consortium (or working group, or whatever) that was supposed to come up with enhancements to SATA. Somehow their name started getting used as if it were a standard—but it doesn’t refer to any such thing. The SATA II group is now known as SATA-IO, I believe.

SATA can be 150MB/s or 300MB/s. It can support things such as native command queuing (NCQ), port multipliers (PM), hot swap, and staggered spin-up. So look real close at any drive you buy to see if these features are supported.

For my purposes, either 150MB/s or 300MB/s is fine, since I don’t expect more than 70MB/s from a drive. In fact, I might enjoy using a port multiplier to put several drives on the same SATA channel, but port multipliers sound very hit-and-miss WRT driver, firmware, and hardware support; so I’m avoiding those. In the future sticking three or four drives on a 300MB/s SATA bus would be a nice way to get a whole bunch of drives in a system. (Just make sure the bus your controller is connected to can actually carry that kind of throughput to the machine!) Still, a controller that supports port multipliers is a plus; at least I’ll have the option for using them in the future.
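The "three or four drives" figure is just division. A tiny sketch, assuming the 70MB/s sustained rate from above and ignoring protocol overhead (so real numbers would be somewhat lower); `drives_per_link` is a name invented here:

```python
def drives_per_link(link_mb_s, drive_mb_s=70):
    """How many drives can stream at their full sustained rate behind one link."""
    return link_mb_s // drive_mb_s

for link in (150, 300):
    print(link, "MB/s link:", drives_per_link(link), "drives at 70MB/s each")
```

So a 150MB/s link saturates at two such drives and a 300MB/s link at four.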

I desire NCQ support, since a few benchmarks I’ve seen here and there seem to indicate it can provide a significant, if not quite substantial, speed increase. (Check out the MaxLine NCQ vs. no NCQ benchmarks on StorageReview.com for an example, I think.) There’s also tagged command queuing (TCQ) support I’ve seen on some controllers. I don’t think I’ve seen a drive advertise this feature yet, but I haven’t really looked. I seem to recall TCQ from SCSI. I don’t know if TCQ offers a performance boost, if TCQ is implicit in every SATA implementation, if TCQ and NCQ can be used together, etc. I know nothing about TCQ. I like acronyms, and I don’t think it’s a bad thing.

Hot swap support is nice, but not a requirement. Honestly I’m not sure I’d ever use it. I’m likely too afraid to pull a failed disk out of a live RAID 5 array.

This brings up a side note: how afraid should I be of a two disk failure? Losing 2TB of information because of some odd power problem, for example, would piss me right off. On one hand I think about RAID 6. But I couldn’t get a full 2TB—and we all know how important that is—even with 300GB drives, since I’d lose two to parity. I’m actually thinking more of keeping actual backups. Maybe on DVD media. What might be fun is a system that pops out your DVD writer tray and lets you toss in a DVD whenever you think about it. When you put in a DVD, it keeps the disc in the drive and backs up as much new/changed (since the last disc) information as it can fit on the disc, then it pops it out. Later, you stow that disc away, pop in a new one, and the process keeps repeating itself. If you ever actually catch up with all the data on your disc, it just keeps the DVD in the writer. If I actually wanted to use a DVD writer more frequently in that system, it might be a problem that the backup software could theoretically always be using it; but I don’t predict I’ll be burning DVDs from my file server. One problem might be keeping an index of files: how does it know that a file on disc 14 was superseded by a newer version on disc 49? Maybe you could fit an index on a USB thumb drive, or just another hard disk, or a copy on every hard disk, or something like that. (Mirrored USB flash drives!) Just an idea. I wonder if anyone has made something like this.
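To make the index idea concrete, here is a minimal sketch of the bookkeeping. Everything in it is hypothetical: the index is a plain dict from path to the modification time and disc number of the newest backed-up copy, the disc capacity is the nominal single-layer DVD size, and files are picked greedily, oldest change first. A file larger than one disc would never be picked, and actually burning the disc is left out.

```python
DVD_CAPACITY = 4_700_000_000  # bytes; nominal single-layer DVD (an assumption)

def plan_disc(files, index, capacity=DVD_CAPACITY):
    """Pick new or changed files that fit on one disc.

    files: iterable of (path, size_bytes, mtime)
    index: dict mapping path -> (mtime, disc_number) of the newest backed-up copy
    """
    chosen, used = [], 0
    for path, size, mtime in sorted(files, key=lambda f: f[2]):  # oldest first
        backed_up = index.get(path)
        if backed_up is not None and backed_up[0] >= mtime:
            continue  # the newest version is already on some disc
        if used + size > capacity:
            continue  # does not fit on this disc; leave it for a later one
        chosen.append((path, size, mtime))
        used += size
    return chosen

def record_disc(index, disc_number, chosen):
    """After a successful burn, point the index at the new disc."""
    for path, _size, mtime in chosen:
        index[path] = (mtime, disc_number)
```

Because `record_disc` overwrites the entry, the index always names the disc with the newest copy, which answers the "disc 14 vs. disc 49" question: the restore tool looks up the path and asks for whichever disc the index says. When `plan_disc` returns an empty list, the backup has caught up.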

Done with that tangent now. The summary: don’t say “SATA II—that’s good enough!” Look closely at exactly what the device (controller or drive) supports.

I’ll add a bit more about SATA physical interfaces. COOLDrives has a nice visual introduction to all of the different types of SATA connectors. (COOLDrives also seems to have a good selection of all things SATA, such as enclosures, controllers, cables, etc. I haven’t bought from them, haven’t comparison shopped their prices, but their selection is great and their web site is more or less clean enough to be usable. Lots of pictures help. Kudos.) There’s your typical SATA connector, which I think of as an “internal SATA connector.” Then there’s your eSATA connector, for external connections. There’s also such a thing as multi-lane eSATA, which is really four SATA cables (that can do 300MB/s each, so as to set your mind at ease) in one. You lose nothing, you gain fewer cables hanging around outside your box, fewer connectors, and maybe even a more sturdy connection from the looks of it. On the other hand, I haven’t seen many SATA controllers with eSATA multi-lane connections available. I have seen a few with, say, four eSATA (not multi-lane, but, er, “single-lane”) connectors on them. Now, perhaps you’ve got an enclosure that has these same eSATA connectors on it; in that case, no problem, you just buy a bunch of eSATA cables. Some enclosures I’ve seen at COOLDrives have the option of either multi-lane connectors or “single-lane” eSATA connections on them. What I am interested in finding is a cable that breaks a multi-lane connection, such as you find on an enclosure, into four eSATA connectors, such as you might find on a controller. (P.S.: sorry for the bad example, since I have no idea what they’re talking about with that “eSATA - SATA converter cable” in there. I just wanted to illustrate four “single-lane” eSATA ports. I think those are eSATA ports. Next stop, confusion central. All aboard!)

(P.P.S: After doing some more reading, Areca controllers document something called internal multi-lane on their ARC-1130ML/1160ML controllers. So I guess multi-lane can be internal or external.)

Update: OK, I didn’t want to go to P.P.P.S., but Areca confuses me. They’ve got external “Infinband” or “Infiniband” connectors, depending on what day you catch them on. It looks like the Infiniband connector is also known as SFF-8470. Searching around on this term led me to a list of cables sold by Adaptec where it looks like they have an internal multi-lane to 4 SATA “fan out” cable. Are internal and external multi-lane cables different? That connector on the COOLDrives SATA connectors page sure looks bigger, and has screws on it. COOLDrives also has an internal 4 port to external multi-lane adapter. They mention Infiniband here.

So here’s what I’m going to say: first of all, instead of Multi-lane I’m going to write multi-lane. Second, Infiniband is, as far as I know, a new high speed serial bus; it is a competitor to PCIe. I’m going to guess there are specifications for external Infiniband connections, and it just so happens that the SATA multi-lane connections use the same Infiniband connectors and cabling. I’ll go totally wild and further speculate (A.K.A. pull out of my ass) that they do this because Infiniband was very high speed, faster than SATA, and they had already figured out a design for these things that minimized interference/crosstalk/hamsters. It was a good, workable design that someone had already spent time on, and even developed equipment for, so they just went ahead and used it. By “they” I’m probably talking about AMCC/3ware, for starters, from everything I’ve read. Oh, and you can have Infiniband SATA internally or externally. I somehow doubt the connectors are 100% compatible. The internal ones seem to be described as “clicking” whereas the external seem to have the screws, as shown on COOLDrives. I should really just mail them and ask how you’d get from internal Infiniband to external Infiniband. The only way I can see is those fanout cables from Adaptec to the little converter linked above, and that seems dorky to “fan out” just to consolidate them back in on the other side of the face plate, but with a different connector.

Now someone just figure out if I need to write “single lane” or “single-lane.” I actually think it’s the former.

Some links:

  • Serial ATA (SATA) Linux status report has good information on the status of support for a particular controller/chipset in Linux, but it also has details on controller capabilities. I used this to find out the Promise SATAII150 SX8 (why do you think I started looking into the definition of SATA terms? That confusing name) supports NCQ, hot swap, PM, etc.
  • The “TechTarget network,” despite sounding like a branch of a well-known chain of discount stores in the US, has several articles with potentially useful information about SATA (see also article linked at beginning of this entry from them):
    • What’s in a name? SATA II Misconceptions
  • COOLDrives has an SATA multi-lane FAQ with a little more information on that connector type.

For my future reference, a list of SATA controller manufacturers (taken from handwritten notes in a margin, so don’t trust this to be correct, let alone complete): Proximity Data, LSI, Promise, Areca/Tekram, Adaptec, Highpoint, Sonnet, Addonics, Supermicro, Intel, Newcell.

Friday, 19 August 2005

PCI vs. PCI-X vs. PCI Express

darkness @ 23:51:58

This is part of a series of entries about building a new 2TB array. Or trying to, at least.

First thing I had to sort out was all of my I/O options.

  • PCI: my current array sits on three HPT302 ATA/133 controllers. All on the same PCI bus. This is not a special PCI bus. I suspect it is your usual 32-bit, 33MHz PCI bus. There are, apparently, 66MHz PCI buses, and I’m quite certain there are 64-bit buses. There are 5V PCI slots/devices and 3.3V PCI slots/devices. I suspect I have the latter. Anyway, PCI bus bandwidth: in the realm of 132MB/s. Five drives, even at, say, 60MB/s each: 300MB/s. Not so good.

    (Of course, what’s even worse are the HPT302 quasi-open-source drivers. I think recent kernels don’t need them, but at the time of RH8.0 I certainly needed them. From time to time the drives “stall”—I just don’t know what else to call it. Everything blocks for 10-20s, then it comes back and all is right in the world. Plus it’s slow according to hdparm, but I don’t really trust hdparm.)

  • PCI-X: I’d say it’s PCI 2.0… but that’s really just the new version of PCI. Most everything you get these days in regular ol’ PCI is going to be PCI 2.2, I think. No, PCI-X is (in my words) an evolution of PCI. Cards and slots are both backwards compatible. PCI-X running at the same width and speed as PCI is supposedly still faster than PCI.

    You’re most likely to find PCI-X to be 64-bits wide. I don’t think I’ve seen a current motherboard that has 32-bit PCI-X slots.

    PCI-X can run at 66MHz, 100MHz, or 133MHz. Technically, the newer PCI-X 2.0 can run at 266MHz or even 533MHz, but I’ve never seen one faster than 133MHz.

    If you’re keeping up with math, you may have noticed some nice numbers: PCI-X 64-bit/100MHz = 800MB/s, PCI-X 64-bit/133MHz = 1GB/s. Thems fast. There are limits though:

    • A 133MHz bus can only support one device. A 100MHz bus can only support two devices. I believe a 66MHz bus can support four devices. On some motherboards I was looking at, things like the gigabit NIC would be on the same bus as one of the 64-bit/133MHz slots. So if you put a card in that slot, surprise! Your NIC and your device are both running at 100MHz now.

    • PCI-X is backwards compatible with PCI, and that goes for both slots and devices. This sounds nice, until you (almost immediately) realize that it means everything on a bus is going to run as slow as the slowest device. In reality, I read one source that said a PCI-X card on a PCI bus will always fall back to 33MHz; 66MHz PCI isn’t an option for a PCI-X device. So, I’ve got three PCI PATA controllers I’d like to keep, and (for example) two new PCI-X SATA controllers I’d like to add. I need a board with at least one 64-bit/100MHz PCI-X bus with two slots (and just two slots on that bus), and then another three PCI slots on top of that. Hope you’ve got on-board NIC and video.

    The vast majority of motherboards I’ve seen don’t really seem to give me enough I/O options. I really want something insane like four PCI-X 133MHz buses with one slot each, and then a few more PCI slots. Maybe a PCIe (see below) or two. Oh, and under $200 too, please. I might go $250 if I have to.

  • PCI Express: not to be confused with PCI-X! Yes, they’re different. Genius naming guys. “Like, these guys couldn’t have devised names that sound just a little more unique than ‘PCI-X’ and ‘PCI Express’?”

    Whereas PCI-X is based on PCI, PCI Express (PCIe) is apparently quite different. Serial instead of parallel, and capable of several metric assloads more speed. Slots and devices are nowhere near backwards compatible AFAIK. Also, PCIe used to be called 3GIO, in case you run into that term. PCIe and 3GIO are the same thing.

    PCIe is usually specified as “x1,” “x8,” etc. This specifies the number of “lanes” that the slot offers. A lane is a bi-directional serial channel capable of around 250MB/s in each direction (sending and receiving). So an x16 slot should be 4GB/s in each direction. It took me a little while to realize this was in contrast to PCI, which goes in one direction at a time. So where your PCI-X 64-bit/133MHz slot has 1GB/s bandwidth, your x4 PCIe slot really has 2GB/s (1GB/s transmit, 1GB/s receive, simultaneously).

    I think PCIe also offers some other features, like hot swap. It seems poised to be “the expansion bus of the future.” Of course, the main class of peripherals I see using it currently are graphics cards. I suspect this is often why you find a motherboard with one PCIe x16 slot and one PCIe x1 slot: the x16 is for your video card (P.S.: AGP 8x does 2.1GB/s! Let me stick a storage controller in that slot, baby) and the x1 slot is to make you feel slightly more manly. Or maybe it’s for a gigabit NIC.
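The numbers in the list above all fall out of a few lines of arithmetic. Here is a sketch that reproduces them; the function names are made up, the per-lane figure is the rough 250MB/s quoted above, and the devices-per-bus table is copied from the limits listed for PCI-X (the 66MHz entry being the "I believe" one).

```python
def parallel_bus_mb_s(width_bits, clock_mhz):
    """Peak bandwidth of a parallel bus (PCI, PCI-X): one transfer per clock."""
    return width_bits // 8 * clock_mhz

def pcie_mb_s_each_way(lanes, lane_mb_s=250):
    """Peak PCIe bandwidth per direction, at roughly 250MB/s per lane."""
    return lanes * lane_mb_s

# Maximum devices per PCI-X bus at each clock, per the limits listed above
PCIX_MAX_DEVICES = {133: 1, 100: 2, 66: 4}

def pcix_clock_for(n_devices):
    """Fastest PCI-X clock (MHz) a bus can run with n_devices attached."""
    for clock in sorted(PCIX_MAX_DEVICES, reverse=True):
        if n_devices <= PCIX_MAX_DEVICES[clock]:
            return clock
    raise ValueError("too many devices for one PCI-X bus")

print("PCI 32-bit/33MHz:    ", parallel_bus_mb_s(32, 33), "MB/s")
print("PCI-X 64-bit/100MHz: ", parallel_bus_mb_s(64, 100), "MB/s")
print("PCI-X 64-bit/133MHz: ", parallel_bus_mb_s(64, 133), "MB/s")
print("PCIe x4, each way:   ", pcie_mb_s_each_way(4), "MB/s")
print("PCIe x16, each way:  ", pcie_mb_s_each_way(16), "MB/s")
print("PCI-X slot sharing a bus with a NIC runs at", pcix_clock_for(2), "MHz")
```

These are theoretical peaks; real throughput is lower once bus overhead is counted.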

This study of buses was necessary for me to decide what kind of storage controllers I’ll be interested in. Accordingly, I plan my next installment to cover the various options one has for SATA controllers.

I do have one question remaining from all of this talk of buses: what’s the speed of the connection between the various I/O controllers and buses and what not? Hopefully no one is running a 4GB/s bus to your expansion card, but a 1GB/s bus back to the rest of the system. I’d have a little trouble figuring out the point of that, probably.

Some links on the above topics:

Thursday, 18 August 2005

Putting together a new array

darkness @ 09:39:35

My birthday is coming up. I’m not sure if I have enough variety in my wish list, so I was thinking about putting a new disk array on there. My current 480GB array is at something like 91% full.

Naturally the first thing that comes to mind is SATA. Cheap and real fast. Of course, then I find out an interesting fact: most IDE drives can’t sustain more than about 70MB/s anyway. I had no idea, really. When they’re serving from cache, sure, they can do way faster than 70MB/s, but how big is your cache? 8MB? 16MB? I gather that cache, especially in a file server, isn’t terribly important.

Moreover, if you look at the reviews on StorageReview.com you’ll see that almost every disk in the same capacity/RPM has almost the same performance. Hence lots of people in their forums telling you to look for best price or warranty, instead of performance numbers.

That was eye opening. Plus, it took away my requirement for PCI Express (henceforth PCIe). I figured I’m going to have 8-9 drives, 8 * 300MB/s (SATA II) = 2400 MB/s. I needed PCIe. All I was finding, though, were PCI-X controllers, 64-bit/133MHz at best. Very few people made PCIe controllers, especially with more than two SATA II ports–if SATA II at all. Now I realize, though, that I should plan for something more like 8 * 70MB/s = 560MB/s, which is well in the reach of 64-bit/133MHz PCI-X, which should work out to theoretical maximum throughput of about 1GB/s. That even leaves enough room for PCI-X gigabit NIC if I need one (i.e., if it’s not on-board).
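The change of heart comes down to which per-drive figure gets multiplied. A short sketch of the two estimates against 64-bit/133MHz PCI-X, using only the numbers from the paragraph above (the function name is invented for illustration):

```python
def array_throughput_mb_s(drives, per_drive_mb_s):
    """Aggregate throughput if every drive streams at the given rate."""
    return drives * per_drive_mb_s

pcix_peak = 64 // 8 * 133  # 64-bit/133MHz PCI-X, about 1GB/s theoretical

for label, rate in (("interface rate", 300), ("sustained rate", 70)):
    total = array_throughput_mb_s(8, rate)
    verdict = "fits within" if total <= pcix_peak else "exceeds"
    print("8 drives at", label, "=", total, "MB/s,", verdict, "PCI-X peak")
```

At the sustained rate there is roughly 500MB/s of theoretical headroom left on the bus, which is where the room for a gigabit NIC comes from.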

My brain is awash with my research of the past couple days. I just realized I need to organize it a bit more, to present all the interesting products, sites, standards, and other information I found. So I’ll put off more information until a future (near future, hopefully) entry.

I’ll add just a couple more random things:

  • I’m almost positive I can put together 2TB and a new server for way, way less than I could do it from, say, Dell, HP, or IBM. I like IBM’s server web site the best, though, especially their server finder. This may drive me to buy from them in the future. (That and their prices and servers look pretty damn good.)
  • Linux SATA support still looks a bit shaky. Only a few chipsets, for example, seem to support hot swap, port multipliers, etc. All the interesting chipsets have bare minimum support, sometimes from the vendor, sometimes from the community. Here I’m talking about the SIL3124 (no PM, no hot swap, according to comments in the latest libata patches) and Areca/Tekram (Marvell?) cards especially. I’ll talk more about them later, but if you want an SATA expansion card with good Linux support, my initial findings suggest you’re best off going with 3ware, who release their own drivers which seem to be subsequently included in the kernel.