Bufferpools and RAM Commit/Deliver

Bufferpool configuration is an often-overlooked issue because of how rarely it nails you, but in those rare cases it can be very important.

A bufferpool is simply a resource limit on a collection of RAM — typically this is a buffer, ie in-RAM space that cannot be swapped out because it represents in-flight transactions, uncommitted pages, or pre-fetched content that will be needed very soon.

A commit is the RAM that is offered to a process — in glibc, this can default to 2G. This doesn’t mean that every process automatically consumes 2G of RAM, but that the kernel offers up to 2G to the process. Recall that, due to sparse garbage in RAM pages, RAM offered to a process is NEVER reaped by the system back to the common pool.

A commit can be dangerous when the OS over-commits RAM in a long-lived environment: if up to 100G is offered on a system with only 32G, you can see how, as many threads grow their demand for RAM, the system will swap out some processes to meet demand. This is a typical action in a multiuser system with swap active. (NOTE: Motorola tuned their commits on smartphones because on diskless systems, there’s obviously no swap.)

In a long-lived database process, if the bufferpool is a commit, then it will soon grow to its maximum commit. It can never be swapped out unless the database has bursty use-cases and no active sessions for long periods. The configured bufferpools may be in addition to the heap space taken up by the process itself (in un-pooled resource space); the database itself may limit bufferpools, yet still consume a number of GB over the configured bufferpool space.

The other applications on the system can also demand up to their committed RAM — why limit one process while letting the others run amok on your server?

On long-lived systems, committed RAM becomes allocated and consumed RAM. Bufferpools need to be configured, and RAM usage monitored (or at least traps/exceptions raised when a critical DB starts swapping — an indication that a review is critically needed).
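A minimal sketch of that monitoring, assuming a Linux host where /proc/meminfo exposes MemTotal and Committed_AS (the total RAM the kernel has promised out); the class name and the 1.0 threshold are my own illustration:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.HashMap;
    import java.util.Map;

    public class CommitCheck {
        public static void main(String[] args) throws IOException {
            // /proc/meminfo lines look like "MemTotal:  32849512 kB"
            Map<String, Long> mem = new HashMap<>();
            for (String line : Files.readAllLines(Paths.get("/proc/meminfo"))) {
                String[] parts = line.split("\\s+");
                mem.put(parts[0].replace(":", ""), Long.parseLong(parts[1]));
            }
            long totalKb = mem.get("MemTotal");         // physical RAM, kB
            long committedKb = mem.get("Committed_AS"); // RAM promised to processes, kB
            double ratio = (double) committedKb / totalKb;
            System.out.printf("Committed_AS / MemTotal = %.2f%n", ratio);
            if (ratio > 1.0) { // commitments exceed physical RAM: swap pressure looms
                System.out.println("WARNING: over-committed; raise a trap before the DB swaps");
            }
        }
    }

Committed_AS creeping past MemTotal is exactly the silent condition described above: nothing has failed yet, but the promises can no longer all be kept.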

Bufferpools and Commit/Demand discrepancies are silent but deadly killers, like the sharks and heart-disease of the resource-management domain.

VirtualWisdom UDCs for Script Variables

I worked with a user of a SAN monitoring product called VirtualWisdom, which allows the user to define a metric for devices based on filters — basically, evaluating unrelated filter expressions in order, stopping at the first hit found, and providing a “catch-all”.

We wanted to allow the SAN administrator to keep migrating content off EVAs to remove an interposing virtualization product, with alerts to indicate when some content was in a critical state without the admin having to poll it continuously. We wanted to work with the user-visible response time of the SAN (ie keep response time below 8ms) but had to settle for Capacity/Utilization metrics.

In our case, we looked at running evaperf on a disk array showing poor throughput — as soon as the user-visible response time degrades (we measure this as “Exchange Completion Time”), evaperf is run against that array. The problem is that the name of the array (ie the parameter to evaperf that indicates which array) is uppercase, and doesn’t match the various names of hosts (ie servers named such as oradb07a2) or arrays (arrays named such as westeva8). We used UDCs:

  1. if the ITL target is “wdceva1”, UDC “EVA” value is “EVA01”
  2. if the ITL target is “edceva2”, UDC “EVA” value is “EVA02”
  3. if the ITL target is “wdceva3”, UDC “EVA” value is “EVA03”
  4. if the ITL target is “edceva4”, UDC “EVA” value is “EVA04”
  5. if the ITL target is “westeva7”, UDC “EVA” value is “EVA07” (yes, change in naming format)
  6. if the ITL target is “easteva8”, UDC “EVA” value is “EVA08”
  7. … etc …
  8. otherwise, UDC value is “EVA00” as a catch-all
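Conceptually, the UDC is just an ordered, first-match rule table with a catch-all. A minimal sketch of that idea — plain Java, not VirtualWisdom’s actual API; the names are illustrative:

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class UdcLookup {
        // Ordered rule table mirroring the list above; evaluated top-down.
        private static final Map<String, String> RULES = new LinkedHashMap<>();
        static {
            RULES.put("wdceva1", "EVA01");
            RULES.put("edceva2", "EVA02");
            RULES.put("wdceva3", "EVA03");
            RULES.put("edceva4", "EVA04");
            RULES.put("westeva7", "EVA07");
            RULES.put("easteva8", "EVA08");
        }

        static String evaFor(String itlTarget) {
            for (Map.Entry<String, String> rule : RULES.entrySet()) {
                if (rule.getKey().equals(itlTarget)) {
                    return rule.getValue();   // first matching rule wins
                }
            }
            return "EVA00";                   // catch-all when no rule matches
        }

        public static void main(String[] args) {
            System.out.println(evaFor("westeva7")); // EVA07
            System.out.println(evaFor("mystery9")); // EVA00
        }
    }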

What this allowed us to do is create an alarm:

  1. groups by link, channel
  2. filter: EVA is not “EVA00” to avoid running the script where we don’t know the EVA
  3. if utilization exceeds 80%, run the evaperf external-script
  4. script parameters include $EVA$, which is defined as the value of the UDC “EVA” when the alarm triggers
  5. the “re-arm” stage sends an email to alert the administrator that evaperf was run and there is a report to pick up

This alarm forms the basis of an ECT-based alarm, but the metrics do not coincide, so we’re still looking at how to do that.

Currently, the administrator can apply this alarm to all switches, and as EVA utilization gets too high, the alarm automatically runs the evaperf tool to show what LUNs are being heavily used. The re-arm is set fairly long so that the running of evaperf (which we expect to cause some slowdown in overall processing) doesn’t get detected (via performance impact) as ANOTHER reason to run the evaperf script.

As it is, this alarm allows the administrator to focus on migrating content off the EVAs, but proceed at an orderly pace until utilization on an existing link needs to be distributed to another link — and the evaperf tool tells him which LUNs to move.

SAN Aliases for WWNs from Zonesets: Voting

I had a problem:

1) the switch I’m looking at has no aliases/nicknames for WWNs
2) the zonesets include names, but no ordering
3) produce tuples of {WWN, alias}, with no dupe WWNs or aliases, choosing the most likely pairs

The input looks like:

Active Zoneset:
  Zone: FAB12SW33_ORAC4_HBA0_0899_FA_4CA
    ZoneMember: 10:00:00:00:C9:7D:B5:04
    ZoneMember: 50:06:04:8A:D5:31:AC:23
...
...
  Zone: FAB12SW33_ORAC4_HBA1_0899_FA_4CA
    ZoneMember: 50:06:04:8A:D5:31:AC:23
    ZoneMember: 10:00:00:00:C9:7D:B5:05
  Zone: FAB12SW33_ORAC4_HBA1_0899_FA_13DB
    ZoneMember: 10:00:00:00:C9:7D:B5:05
    ZoneMember: 50:06:04:8A:D5:31:AC:27
  Zone: FAB12SW33_ORAC4_HBA1_0899_FA_14DB
    ZoneMember: 10:00:00:00:C9:7D:B5:05
    ZoneMember: 50:06:04:8A:D5:31:AC:27
    ZoneMember: 10:00:00:00:C9:7D:B5:04

The intended output would be like:

10:00:00:00:C9:7D:B5:04, ORAC4_HBA0
10:00:00:00:C9:7D:B5:05, ORAC4_HBA1
50:06:04:8A:D5:31:AC:23, 0899_FA_4CA
50:06:04:8A:D5:31:AC:27, 0899_FA_13DB

You can see how the Zone name has to be chopped up, and the ZoneMember items are not ordered… but because a WWN shows a slight correlation with its alias — they co-occur across zones more often than mismatched pairs do — I chose a voting algorithm:

1) the first pass simply cleans things up, chops out the Active Zoneset
2) second pass tries to order the Zone name, normalize the format
3) third pass breaks apart the Zone name and produces a weighted-vote:

3a) the weight of a set is 100, so if there are three WWNs, each gets 33; two WWNs, 50 each
3b) each alias gets a weighted vote from each WWN
3c) vote/ballot totals are tallied in a { WWN, alias } == total_votes format
4) fourth pass orders the tuples by highest votes first
5) fifth pass removes dupes and outputs the tuples with the highest vote where the WWN and alias have not been seen before (a sketch of passes 3–5 follows)
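A minimal sketch of passes 3–5, assuming passes 1–2 have already split each zone name into its candidate aliases (the messy, format-specific part); this is illustrative Java rather than the original awk:

    import java.util.*;

    public class AliasVote {
        record Zone(Set<String> wwns, Set<String> aliases) {}

        public static void main(String[] args) {
            // Hand-fed example zones; in practice these come from passes 1-2.
            List<Zone> zones = List.of(
                new Zone(Set.of("10:00:00:00:C9:7D:B5:04", "50:06:04:8A:D5:31:AC:23"),
                         Set.of("ORAC4_HBA0", "0899_FA_4CA")),
                new Zone(Set.of("10:00:00:00:C9:7D:B5:05", "50:06:04:8A:D5:31:AC:23"),
                         Set.of("ORAC4_HBA1", "0899_FA_4CA")),
                new Zone(Set.of("10:00:00:00:C9:7D:B5:05", "50:06:04:8A:D5:31:AC:27"),
                         Set.of("ORAC4_HBA1", "0899_FA_13DB")));

            // Pass 3: each zone carries 100 votes, split evenly among its WWNs;
            // every {WWN, alias} pair in the zone receives that weighted vote.
            Map<String, Double> votes = new HashMap<>();
            for (Zone z : zones) {
                double weight = 100.0 / z.wwns().size();
                for (String wwn : z.wwns())
                    for (String alias : z.aliases())
                        votes.merge(wwn + "|" + alias, weight, Double::sum);
            }

            // Pass 4: order the tuples by highest vote total first.
            List<Map.Entry<String, Double>> ranked = new ArrayList<>(votes.entrySet());
            ranked.sort(Map.Entry.<String, Double>comparingByValue().reversed());

            // Pass 5: greedily keep the best pair whose WWN and alias are both unseen.
            Set<String> seen = new HashSet<>();
            for (Map.Entry<String, Double> e : ranked) {
                String[] pair = e.getKey().split("\\|");
                if (!seen.contains(pair[0]) && !seen.contains(pair[1])) {
                    seen.add(pair[0]);
                    seen.add(pair[1]);
                    System.out.println(pair[0] + ", " + pair[1]);
                }
            }
        }
    }

Fed the sample zones above, the greedy final pass recovers exactly the intended {WWN, alias} tuples.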

The result, unfortunately, was only 500 aliases, and it took 4 hours of work; it’s implemented in awk. I’m sure someone will do this in perl, or Java, but awk was my portable tool. I should be able to throw zonesets from all fabrics at this same tool from i10k switches in the future.

Checksums on Transfers

I used to send people checksums for downloads; recently, we seem to need to do this as a company.

Whenever a file is uploaded to the EU FTP server that I manage, the server sends me a sanity-check email, something like:

size: 2628153754
user: abc_cheese
md5sum:      23fe22d1afad721740c0178b6ab842b0  /home/abccheese/backup-2010-11-29-12-21.zip
BSD sum -r:  38354 2566557
SYSV sum -s: 43793 5133113 /home/abccheese/backup-2010-11-29-12-21.zip

Processing archive: /home/abccheese/backup-2010-11-29-12-21.zip

Error: Can not open file as archive

When I was having to double-check uploads, I found it easier if the server itself told me when the upload was finished — and better, told me whether the file looked OK. That doesn’t necessarily say the file is complete, simply that, for certain file types, it looks sane: Zip files and 7-zips should pass a zip-defined sanity check, for example.

On an old FTP server, I enabled the “SITE” command to allow checksums — which I later had to optimize to return a pre-calculated checksum (using “make” logic to update on changed files) in order to avoid a DoS on the server when the checksum took too long to generate. The intent was to let a random user calculate the checksum of a file to confirm the transfer was successful, reducing the possible errors when “something didn’t work out”. Like an XML schema, it confirms that “the FTP server made specific delivery: the obligation of providing a file accurately was performed”, just as a schema splits the “where did the error occur?” question in half.
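A minimal sketch of that “make” logic, assuming a cache file named <file>.md5 beside the original; the names here are illustrative, not the FTP server’s actual code:

    import java.io.IOException;
    import java.math.BigInteger;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    public class CachedChecksum {
        static String cachedMd5(Path file) throws IOException, NoSuchAlgorithmException {
            Path cache = Paths.get(file.toString() + ".md5");
            if (Files.exists(cache)
                    && Files.getLastModifiedTime(cache)
                            .compareTo(Files.getLastModifiedTime(file)) >= 0) {
                return new String(Files.readAllBytes(cache)).trim(); // cache still fresh
            }
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            md5.update(Files.readAllBytes(file));                    // the expensive part
            String sum = String.format("%032x", new BigInteger(1, md5.digest()));
            Files.write(cache, sum.getBytes());                      // remember for the next SITE request
            return sum;
        }

        public static void main(String[] args) throws Exception {
            System.out.println(cachedMd5(Paths.get(args[0])));
        }
    }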

In the “SantaSack” project, every file was stored by its checksum — yes, I had to do collision-avoidance in an MD5 signature store. I joined the “mysqlfs” project as a replacement for SantaSack, with the intent of developing a layer that pre-calculates MD5s and SHAs asynchronously on change, storing them as file attributes for later query. I’m still considering that for MDS on OSX.

My company is looking into checksums on transferred files, now; it seems self-gratifying in an arrogant sense to see them crossing ground I’ve been over, but I regret that I’m not better-prepared.

The MD5 checksum is everywhere: UNIX (md5sum {file}), BSD derivatives of UNIX (md5 {file}), Windows, and cross-platform in Java (martin: 2010-07-28_13:49:49):

        import java.math.BigInteger;
        import java.security.MessageDigest;
        import java.security.NoSuchAlgorithmException;

        static String getMd5Digest(String pInput)
        {
            try
            {
                // Hash the input bytes and render the 128-bit result
                // as a 32-character uppercase hex string.
                MessageDigest lDigest = MessageDigest.getInstance("MD5");
                lDigest.update(pInput.getBytes());
                BigInteger lHashInt = new BigInteger(1, lDigest.digest());
                return String.format("%1$032X", lHashInt);
            }
            catch (NoSuchAlgorithmException lException)
            {
                throw new RuntimeException(lException);
            }
        }

All MD5 routines (both the opensource and the RSA versions) are a case of starting with a basic signature, updating it with variable-length buffers of entropy, and reporting the resulting value. This can be done on a buffer as it’s used for other things, which I’ve done: the ftpkgadd tool, a pkgadd that worked from FTP URLs rather than filenames, connected the socket for the GET to the inbound stream of the pkgadd decompressor — the same could be done in a layer such as a compressor that also MD5Update()s each buffer as it is compressed or written to the output stream. In this way, the checksum is ready when the archive is, at little additional cost.
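In Java, the standard library already supports this pattern: a DigestInputStream updates the digest as a side-effect of reading, so the checksum is ready the moment the consumer finishes the stream. A small sketch — my own illustration of the pattern, not ftpkgadd itself:

    import java.io.BufferedInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.math.BigInteger;
    import java.security.DigestInputStream;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    public class StreamingMd5 {
        public static void main(String[] args) throws IOException, NoSuchAlgorithmException {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            try (InputStream in = new DigestInputStream(
                    new BufferedInputStream(new FileInputStream(args[0])), md5)) {
                byte[] buf = new byte[8192];
                while (in.read(buf) != -1) {
                    // hand buf to the decompressor / output stream here;
                    // the MD5 is updated as a side-effect of the read
                }
            }
            System.out.printf("%032x%n", new BigInteger(1, md5.digest()));
        }
    }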

MD5 is fairly ubiquitous, but sadly I don’t have much of this implemented anymore, save the upload-sanity-check on the FTP server.

Twice is a Bug

If a problem happens once, it’s (un)lucky: things just happen, some things are very rare, and fixing them is not economically viable.

If it happens twice, it’s a bug, be it hardware, software, or meatware (users / processes).

Dishonourable mention for the bugs that rarely happen, but require a 5-alarm firedrill to diagnose, and make a company look really, really bad 🙁

If you consider it, even the Software Architect who never talks to customers until their environment is very stable would have to agree: if something happens twice — even if I’m a genius and it never happens to me — it must be more likely than an alignment of the planets, so it should be considered. If users keep hitting the same problem, maybe they have habits other than mine, and maybe they should be considered worthy of helping rather than ignoring.

If the glorified calculators on our desks are more capable of checking for that error, then why aren’t they? (That’s a key tenet behind the Smallfoot project: use the software to do what software’s good at).

I just saw a bug in a release of our product; I think it’s handled in some of the work on the later major revision, but I’m not sure. I don’t want to file until I know, as that wastes developer resources to tell me I’m an idiot (I’ve been an idiot many times, but developer resources are quite valuable in my books). I don’t want to forget to check, but damn, there’s a lot of stuff that happens in my workday, and my memory is fairly sketchy (poor-quality meatware).

Maybe it’ll happen a third time. Thrice is definitely a bug.

NetFlix Outside USA: AppleTV VPN

Put a VPN on your AppleTV to make it connect from an apparently-USA IP address to get full access to your Netflix subscription.

This is what I said as a solution to the problem of traveling in other countries, taking your new AppleTV with you (let’s not ask why you’d pack that over, say, a helmet, or a SCUBA reg), and accessing the full line of USA Netflix content — with a proper, paid USA subscription — while traveling.

So I said “put a VPN on your AppleTV, and connect through there to stream content”. This also requires that the system you’re streaming through has sufficient bandwidth to both send and receive a copy of the streamed packets. I would not recommend streaming from a residential gateway on the end of a cablemodem, for example, because of the asymmetric imbalance in upstream/downstream data rates and latencies.

So we’re mostly following the FireCore Newbie Procedure:

  1. Download the latest Pwnage (v4.1.2)
  2. Download a compatible AppleTV version 2 image (v4.1 4M89)
  3. Create a jailbroken image, which should offer ssh access
  4. Use ssh to configure the stripped-down OSX on the AppleTV to connect VPN

This is very much like streaming the UK BBC Player to watch soccer outside the UK — because when you travel, you want/need/must have access to your soccer. Yes, I’m talking about you, Cannoli.

MySQL Replication Walkthru: Activate Secondary

After Enabling Replication and Making a Remote Backup, we can activate the secondary server.

Already, our Primary has been returned to service, and we don’t really need to alter it from this point forward; all our work after enabling replication has been on the Secondary. We’ve saved our remote backup on Secondary’s disk, but not yet started the secondary server.

We will:

  1. start the Secondary (skip-slave-start is still in the config file)
  2. Configure the replication
  3. Import the backup file
  4. Start the Replication process on the Secondary
  5. Configure the Secondary so that it will always start the Replication automatically

Our servers:

Primary: Sleepy (192.168.44.3)
Secondary: Doc (192.168.44.4)
MySQL: 5.1.41-enterprise-commercial-pro
OS: Windows 2008R2

Start the Secondary
We still have the option “skip-slave-start” in our my.ini (my.cnf), and now we’re going to start the server; this is as simple as using the Windows services.msc to start the MySQL service.

On a Unix-like server, the init system would start the service, or if you’re in the SysVinit-monstrosity, /etc/init.d/mysql* start or something similar would start it. We can discuss why a script would reside in /etc/ some other time, when the SystemV documents and the Linux FHS are both present (hint: it violates both; /etc is for config files only).

Configure the Replication
Replication can also be configured using the config file, but I did it using the CLI, as follows:

(On Doc/Secondary)

mysql.exe -u root mysql
mysql> CHANGE MASTER TO MASTER_HOST='sleepy.example.com', MASTER_USER='repl', MASTER_PASSWORD='R3plPassw0rd';
mysql> SHOW SLAVE STATUS;

The output of “SHOW SLAVE STATUS” should show a proper host, user, and password, but the log file and log position will still be incorrect.

Import the Backup File

We stored the backup file we made as repldbdump.db. We should already have a user on the local server that can insert and import (by default, “root” can do this), and we’ll import it using:

(On Secondary/Doc)

mysql.exe -u root < repldbdump.db

Another benefit of storing this backup/dump with the “master” options is that it will correct the master log file and master log position for us. A repeat of “SHOW SLAVE STATUS” should show that the MASTER_LOG_FILE and MASTER_LOG_POS in repldbdump.db have set things right.

Start the Replication process on the Secondary

We have the data loaded from the Primary, and we have our replication configured. The import has configured our replication binlog files and positions, so we’re ready to start.

(On Doc/Secondary)

mysql.exe -u root mysql
mysql> START SLAVE;
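
A quick sanity check at this point: on MySQL 5.1, SHOW SLAVE STATUS reports whether both replication threads came up.

(On Doc/Secondary)

mysql> SHOW SLAVE STATUS;
(confirm that Slave_IO_Running and Slave_SQL_Running both read "Yes")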

Configure the Secondary so that it will always start the Replication automatically

Finally, we can remove the config entry we put into our Secondary so that it would start with a crippled Replication config; comment out the skip-slave-start in your my.ini (my.cnf):

(Secondary/Doc)

[mysqld]
...
server-id=2
#skip-slave-start

There’s no need to restart the Secondary, but if it does restart, it will automatically get back into replication.

MySQL Replication Walkthru: Making a Remote Backup

After Enabling Replication (or if you are not using replication, then right after Enabling Network Access), we can use this config to make a remote backup of the database.

Our servers:

Primary: Sleepy (192.168.44.3)
Secondary: Doc (192.168.44.4)
MySQL: 5.1.41-enterprise-commercial-pro
OS: Windows 2008R2

Use Remote Access to Pull a Remote Backup

By this point, you can connect to your server remotely to make queries; now we want to pull a backup.

(on Secondary/Doc)

mysqldump.exe -u repl -h 192.168.44.3 --password=R3plPassw0rd --add-drop-table --all-databases > repldbdump.db

If you’re setting up replication, then you’ll want the additional replication content provided by using this command instead:

(on Secondary/Doc)

mysqldump.exe -u repl -h 192.168.44.3 --password=R3plPassw0rd --add-drop-table --all-databases --master-data > repldbdump.db

The benefit of this additional data is that it sets the replication master file and log position for when you continue to Activate the Secondary.

MySQL Replication Walkthru: Enable Replication

After Enabling Network Access, we can Enable Replication before Making a Remote Backup of the Database. If you’re reading this to simply make recurring backups of your MySQL remotely, then you can ignore this step.

In this step, we’ll

  1. assign IDs to the Primary and Secondary
  2. restart the primary
  3. proceed to backup the database before starting the Secondary

Primary: Sleepy (ID = 1)
Secondary: Doc (ID = 2)
MySQL: 5.1.41-enterprise-commercial-pro
OS: Windows 2008R2

Assign ID to the Primary
While assigning a replication ID, we can also define the binary log for replication; I did this using two parameters in the my.ini (my.cnf) file:

(Primary/Sleepy)

[mysqld]
...
#bind-address=127.0.0.1 # commented to bind to all interfaces
log-bin="E:\Data\repl-bin"
binlog-format=ROW
server-id=1

(Secondary/Doc)

[mysqld]
...
server-id=2
skip-slave-start

Start the Primary server, but don’t start the Secondary yet. Note that “skip-slave-start” is in the config file as opposed to running the Secondary with the command-line option “--skip-slave-start”, which is difficult to do using Windows’ service stop/start. This option is only there for the first run of the Secondary.

(On Sleepy/Primary)

mysql.exe -u root mysql
mysql> GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
mysql> FLUSH PRIVILEGES;

You should notice that when the Primary server starts up again, it begins creating the E:\Data\repl-bin.index and E:\Data\repl-bin.000001 files.

MySQL Replication Walkthru: Enable Network Access

In my replication setup, I needed to make a backup, and I needed to enable TCP/IP access eventually, so I did them as a single step.

Primary: Sleepy
Secondary: Doc
MySQL: 5.1.41-enterprise-commercial-pro
OS: Windows 2008R2

In order to allow Doc (Secondary) access into Sleepy (Primary), Sleepy had to accept remote TCP clients. The process for that was:

  1. create a username/password pair for a new remote user (wildcard host, or a specific host)
  2. configure MySQL to accept remote client access

Create a User/Pass Pair

I wanted to ensure I could access the server’s authentication, so I restarted with “--skip-grant-tables”. In my case, I added this to the \WINDOWS\my.ini, but Linux and Unix-like users (including BSD) might find /etc/my.cnf or /etc/inet/my.cnf. My config looked like:

(On Sleepy/Primary)

[mysqld]
...
...
skip-grant-tables
...

Restart the server.

Next, I connected and ran a GRANT command:

(On Sleepy/Primary)

mysql.exe -u root mysql
mysql> CREATE USER 'repl'@'%' IDENTIFIED BY 'R3plPassw0rd';
mysql> GRANT SELECT ON *.* TO 'repl'@'%';
mysql> FLUSH PRIVILEGES;

Note: “mysql.exe” is obviously “mysql” on Unix-like systems. “%” is a wildcard in MySQL’s world.

The FLUSH PRIVILEGES is not strictly necessary because in our next step, we’ll be restarting the database anyhow.

If you cannot connect, check that the unix-socket is present, and check for socket configs in the my.ini (my.cnf).

Configure MySQL to Accept Remote Client Access
In order to open up the external port (which might already be open, depending on your configuration), I commented out the bind-address in my my.ini (my.cnf) config file:

(On Sleepy/Primary)

[mysqld]
...
...
#bind-address=127.0.0.1
skip-grant-tables
...

After I restarted, I noticed that I could connect using “-h 127.0.0.1” (as I could before) but also using the external address (192.168.44.3):

(On Sleepy/Primary)

mysql.exe -u root -h 127.0.0.1 mysql
mysql> exit
mysql.exe -u repl --password=R3plPassw0rd -h 127.0.0.1 mysql
mysql> exit
mysql.exe -u root -h 192.168.44.3 mysql
(fails, as expected)
mysql.exe -u repl --password=R3plPassw0rd -h 192.168.44.3 mysql
mysql> exit

(On Doc/Secondary)

mysql.exe -u repl --password=R3plPassw0rd -h 192.168.44.3 mysql
mysql> exit

If you cannot connect with “-h 127.0.0.1”, check that the “bind-address” is defined properly or absent completely from the my.ini (my.cnf) file, and that you have restarted the server since you made that change. “netstat -an” would confirm whether mysql is listening on port 3306. “telnet 127.0.0.1 3306” or “nc 127.0.0.1 3306” would confirm whether MySQL is available on that port (or something else is).

If you cannot connect with your external IP address, check that you have the right address, and confirm (using telnet or nc) that you have a service responding there.

If that works fine, comment out your “skip-grant” and restart, then recheck with the same OS-level mysql(.exe) statements as above. Connectivity should work and fail as above.

(On Sleepy/Primary)

[mysqld]
...
#bind-address=127.0.0.1
#skip-grant-tables
...