Dec 20

Two phones taken from chargers this morning at 8:00: the latest HTC Android, and an iPhone 4, not a 4S. The iOS device has all manner of 3G, wifi, and location services activated; the HTC perhaps the same.

The iPhone is at 80% battery, the HTC at 57%. Sure, optimal conditions rate them the other way, but we all know our wifi access points will hit 300 feet / 100 meters of range only in an optimal situation: real life differs, and for my usage, the HTC with MAYBE the same services active is burning battery twice as quickly.

Not sure Android is winning yet.

Dec 14

So I suddenly saw this:

Fatal error: Call to undefined method WebRequest::getIP() in extensions/ConfirmEdit/Captcha.php on line 202

Apparently, this is due to rev 106097, which replaced wfGetIP() with a $wgRequest->getIP() that doesn’t exist. Maybe it’s only in yesterday’s version of MediaWiki. :(

My fix:
cd extensions/ConfirmEdit
svn update -r 106096

I’m posting this blog entry so that others may see it and make use of it.

Dec 07

Where possible, try to avoid using “MATCHES” expressions in Filters that are evaluated often; one suggestion is to move them to UDCs, but it’s not necessarily a constant rule.

I’ve used a few terms in that one-line suggestion, perhaps I can expand on this a bit.

VirtualWisdom lets you make filter expressions such as:

Attached Port Name MATCHES ^OracleServer_*

This powerful logic lets you leverage similar names and terms to select similar servers. Consider selecting similar storage targets or hosts by parts of names, or FCIDs that start or end in the same sequence, or switches using the word “Core” or “Edge” in their role. In fact, a simple filter applied to an alarm can apply a more urgent reaction to a port with errors on a core switch rather than an edge, representing different SLAs or criticality.

The example above says “look for where the Attached Port Name — the nickname of the device attached to a switch — starts with ‘OracleServer_’ “.

UDC — User-Defined Context — allows a VirtualWisdom Administrator to define an additional metric in terms of filter expressions: when various conditions match, a constant enumeration is used for that port’s value, or that ITL’s encoding. For example, for switches with certain names, a “DataCenter” column can identify where that switch is to help forward physical layer errors (such as CRCs) to the right team to more quickly address the issue. Different storage or servers involved in different business units can be enumerated, and based on that “BU” flag or value, different SLAs may be applied, or different teams alerted. UDCs are quite powerful, and are processed on every summary that gets stored in the database.

UDCs can use the same “MATCHES” terms that standard filters can use.

The problem with MATCHES is that it strips away some optimization: the Query Optimizer is a part of a database that cross-references the client’s query with existing possible indices, even aggregate indices, to reduce the processing load by orders of magnitude. Any Oracle Admin who has spent time with the “SQL EXPLAIN” has seen the difference a simple re-ordering of expressions can make in a complex query to get a more efficient join, or fewer rows evaluated for processing to reach a result. These indices only match constant expressions with basic comparison operators such as “==”, “!=”, “<”, and “>”, and are completely inefficient for fuzzy or regular-expression matches.

A “MATCHES” expression in your filter or UDC can increase the load between a VirtualWisdom Portal Server and the underlying MySQL database engine. Although Virtual Instruments Engineering has worked to improve the database schema and queries, resulting in dramatic improvements in processing efficiency and maximum ITL and port count of a Portal Server, we the users still have the power to ruin this with a heavy expression or two.

If a filter isn’t run very often (such as a private dashboard, or a filter used mostly in a daily report), it may not pose very much load on the database; conversely, for a filter that runs constantly, the load of a MATCHES expression repeatedly affects the server for the same data points. It’s almost as though a cache of the filter’s result would avoid rerunning the comparison so often. That is where a UDC can be used.

For filter expressions that run often, consider moving the MATCHES to a UDC calculation, and convert the filter to a comparison against that precise value. For example, if your filter looks like:

Attached Port Name MATCHES BillingServer_* OR Attached Port Name MATCHES CustRecords_*

This can be converted to a UDC such as:

  • default value: “Other”
  • value “Billing” when “Attached Port Name MATCHES BillingServer_*”
  • value “Records” when “Attached Port Name MATCHES CustRecords_*”

This sort of UDC means that up to two MATCHES expressions will run on every Port or Exchange of every summary. If only Servers are identified by this pattern of nicknames, you could also avoid this sort of evaluation on non-Servers by the following:

  • default value: “Other”
  • value “Other” when “Attached Device Type != Server”
  • value “Billing” when “Attached Port Name MATCHES BillingServer_*”
  • value “Records” when “Attached Port Name MATCHES CustRecords_*”
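
The effect of putting the cheap constant comparison first can be sketched outside VirtualWisdom. The Python below is only an illustration of rule ordering, not how the Portal Server evaluates a UDC: the port records, the field names, and the regex spellings are all assumptions made for the example.

```python
import re

# Hypothetical port records; the field names are invented for this sketch.
PORTS = [
    {"name": "BillingServer_01", "type": "Server"},
    {"name": "CustRecords_07", "type": "Server"},
    {"name": "ArrayA_ctl0", "type": "Storage"},
    {"name": "SW12A44:3:1 ISL", "type": "Switch"},
]

BILLING = re.compile(r"^BillingServer_")
RECORDS = re.compile(r"^CustRecords_")


def classify(port, counter):
    """Return the UDC-style value for one port, counting regex evaluations."""
    # Cheap constant comparison first: non-servers never reach the regexes.
    if port["type"] != "Server":
        return "Other"
    counter["regex"] += 1
    if BILLING.search(port["name"]):
        return "Billing"
    counter["regex"] += 1
    if RECORDS.search(port["name"]):
        return "Records"
    return "Other"  # default value


counter = {"regex": 0}
values = [classify(port, counter) for port in PORTS]
print(values, counter)
```

Without the first test, the storage port and the ISL would each cost two regex evaluations before falling through to the default.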

In general, if a MATCHES is rarely evaluated, then its load, however heavy, only affects the server at rare times, so in total has a lower effect. A 100-fold heavier query run only weekly is not worth swapping for a UDC expression run every five minutes.
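
The arithmetic behind that trade-off, as a rough sketch: the cost units are arbitrary, and the five-minute evaluation interval is taken from the sentence above rather than measured.

```python
# Arbitrary cost units: one cheap evaluation costs 1, the heavy query costs 100.
CHEAP_COST = 1
HEAVY_COST = 100 * CHEAP_COST

MINUTES_PER_WEEK = 7 * 24 * 60
runs_per_week = MINUTES_PER_WEEK // 5  # one evaluation every five minutes

weekly_heavy_total = HEAVY_COST * 1           # heavy query, run once a week
frequent_cheap_total = CHEAP_COST * runs_per_week

print(runs_per_week, weekly_heavy_total, frequent_cheap_total)
```

Even at a hundredth of the per-run cost, the frequently evaluated expression ends up costing about twenty times more over the week.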

Try to consider each case where MATCHES is used for conversion to a UDC expression, and whether even that evaluation can be avoided by a constant expression evaluated before the MATCHES expression. Your portal server will thank you!

Nov 28

When presenting data, try to include some sense of quality or accuracy, even if it’s just a flag “I derived this” or “I got this from a very accurate source” or “this is a space-filler”.

I wanted to highlight something I found quite interesting in Axeda Corporation’s Gateway and Connector technologies: Quality of metrics. Axeda uses an enumeration of simple qualities (Good, Bad, or Unknown), and this could theoretically be used when choosing which of two conflicting data types to show.
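
A minimal sketch of how such a quality flag could drive that choice. This is not Axeda's API: the `Quality` ordering, the `Reading` record, and the `prefer` helper are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Quality(IntEnum):
    # The ordering is an assumption of this sketch: higher means more trusted.
    BAD = 0
    UNKNOWN = 1
    GOOD = 2


@dataclass
class Reading:
    metric: str
    value: float
    quality: Quality


def prefer(a, b):
    """Pick the reading with the better quality; keep the first on a tie."""
    return b if b.quality > a.quality else a


derived = Reading("utilization", 0.93, Quality.UNKNOWN)
measured = Reading("utilization", 0.61, Quality.GOOD)
chosen = prefer(derived, measured)
print(chosen.value, chosen.quality.name)
```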

The simple act of collecting and summarizing metrics is not necessarily made easier when precision metadata is tracked, but it can help the end-user make better decisions based on this data: if you see an aberrant data point, do you know it’s seriously out-of-norm and needs to be acted upon, or is it based perhaps on a ratio with a questionable denominator, and should be taken with a bit of skepticism?

Consider precision, or at least define why it’s out-of-scope for your work.

Nov 23

I recently worked with a VirtualWisdom Administrator who wanted to group his ISL port utilization to match his ISL trunks, so we worked out a method of doing this, and I wanted to share it.  As a Field Application Engineer at Virtual Instruments, I tend to focus on these lower-level “how to” issues, working with users to achieve the data representation they need to make informed decisions in lieu of guesses and rule-of-thumb.

Initially, this administrator and I spoke of “trunks”, but between Brocade and Cisco terminology, these mean different things. The aggregations of ISLs into single logical units are “Trunking” in Brocade, but “Port Channels” in Cisco. Trunking E ports in Cisco is a different thing. I’ll use “Aggregate” as much as possible to refer to this in terminology as vendor-neutral as VirtualWisdom is.

We discussed why this Admin wanted to see more than just “the top 20” values on a list of ISLs.  Diving deeper, this was because the top 24 entries are all the same aggregate: essentially, the first entire page is taken up by channels 1 and 2 of a single aggregate ISL.  He wanted to skip beyond this to see the next 20 or 40 ISLs so he could see which ISLs were getting near 90% utilization.  So… why not combine these into a single filter or expression that matches the aggregate, and make sure that each aggregate uses only one row of this resulting table?

Additionally, one switch vendor implements this aggregation of ISLs as balanced utilization across a collection of links between the same endpoints; conversely, another vendor implements this by overflowing one container, then moving to the next. In essence, an abstract aggregation that has 60% Utilization may look like a collection of ports or links each with utilization 60%, or might look like 6 out of 10 links with Utilization 100%, and 4 of 10 with 0%.  That’s very difficult to separate in the data, and can obfuscate which ISL aggregations are approaching maximum desired load.

VirtualWisdom’s focus is on ports: what type, attached to what, etc.  Using Aliases or Nicknames, we can describe the endpoint, and in VirtualWisdom 3.0.0 and later, those ISL Nicknames are determined for us.  Unfortunately, these all include switch/blade/port, so they’re too detailed: we cannot use that combination for a “group-by” expression to separate out the ISL aggregate.

VirtualWisdom is “too detailed” in this case: it wants to show all the ports individually.

A User-Defined Context, or UDC, is a metric with constant values applied using filter expressions. We often use these to automatically apply a logical grouping that better represents the real world implementation. One ISL aggregate between two switches A1 and A2 tends to encompass all E or TE ports on A1 connected to A2, and conversely, all A2 E or TE ports attached to switch A1.  That tends to make this one ISL unique from others.  We create a UDC in the SNMP/Link scope with values based on the “name of the switch” in an ISL: for example, in “SW12A44:3:1 ISL” as a link name, “SW12A44” is the switch name.  ISLs between two switches share the same switch names, but are distinct by this same manner from ISLs to other switches. All we need is a UDC with values such as “SW12A44” where “Attached Port Name MATCHES ^SW12A44:*”, and “SW12B44” where “Attached Port Name MATCHES ^SW12B44:*”.

An example UDC would look like (Using terminology that’s a bit Brocade-leaning for this UDC because the Administrator favoured Brocade terminology) :

[Image: UDC to Group ISL Trunks]

As you can see, grouping these ISL connections by “Probe Name”, “Channel”, and “TrunkChannel”, and filtering by “Attached ISL”, would summarize traffic on all ISLs by the switches each connects, while aggregating the bandwidth of all trunk members between each pair of switches. Grouping by Channel continues to help us keep the directions separate so that a trunk loaded with 95% in one direction and 5% in the other shows “95%” and “5%” rather than “50%”.
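
The grouping itself can be sketched in a few lines of Python. This is an illustration of the idea only, not the UDC or the report engine: the rows, the throughput figures, and the `trunk_of` helper are hypothetical, and the link-name format is assumed to follow the “SW12A44:3:1 ISL” example above.

```python
from collections import defaultdict

# Hypothetical rows: (probe name, channel, attached port name, throughput in MB/s).
ROWS = [
    ("probe1", 1, "SW12A44:3:1 ISL", 380),
    ("probe1", 1, "SW12A44:3:2 ISL", 370),
    ("probe1", 2, "SW12A44:3:1 ISL", 20),
    ("probe1", 2, "SW12A44:3:2 ISL", 20),
    ("probe1", 1, "SW12B44:1:7 ISL", 100),
    ("probe1", 1, "OracleServer_01", 55),
]


def trunk_of(port_name):
    """Map 'SW12A44:3:1 ISL' to 'SW12A44'; anything that is not an ISL to 'NoTrunk'."""
    # Cheap suffix test first, so non-ISL ports skip the name parsing entirely.
    if not port_name.endswith(" ISL"):
        return "NoTrunk"
    return port_name.split(":", 1)[0]


totals = defaultdict(int)
for probe, channel, port_name, mbps in ROWS:
    trunk = trunk_of(port_name)
    if trunk == "NoTrunk":
        continue  # plays the role of the "Attached ISL" filter
    totals[(probe, channel, trunk)] += mbps

for key in sorted(totals):
    print(key, totals[key])
```

Each aggregate now takes one row per direction, so the two channels of SW12A44 stay separate instead of averaging out.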

You’ll notice, too, that we’ve added a short-circuit to mark any non-ISL as “NoTrunk”, the same as the default value. This avoids running the heavier “MATCHES” expression to evaluate ports that aren’t even ISLs. Your Portal server will thank you.

This logic assumes that all ISLs between two switches are in the same Aggregate; if you have any two switches with more than one distinct aggregation of ISLs, our logic no longer applies. One of our Analysts has seen ISLs grouped into multiple distinct aggregates even though they’re between the same switches, but it wasn’t the case in the discussion sponsoring the work I wanted to share.

Some customers have smaller SANs with a few dozen switches; others exceed 280 switches. This number of switches, and the various ISL possibilities between them, makes writing and maintaining a UDC with over 200 values very difficult and labour-intensive.  Because the user is effectively transferring config information from one format to another, accuracy risks enter where users transpose digits, are delayed in echoing the updated config information, or (more often) are simply not informed that any change needs to be echoed or copied. These risks are significant detriments to using this method.

To de-risk this implementation and help you try it out, we’ve created a script to convert a basic list of ISLs with nicknames into a UDC: while the blogging engine doesn’t let me upload this file, your VI Support and Services teams can help you get “ISL2TrunkChannelUDC.awk”, and a version of “awk” to run it. If your report with a Data View of “Table” is saved as a CSV with the AttachedPortName in the 4th column, you would run this script as:

awk -v COL=4 -f ISL2TrunkChannelUDC.awk 'Table-(summary).csv' > ISLAggregation.udc

(resulting UDC tested on version 3.0.2 and version 3.1.0 pre-release, incompatible before version 3.0.0)

I hope this helps you keep watch on your ISL utilization, and show the correct justification for adding ISLs to an aggregation, or balancing traffic to another less-used edge switch.

Keep those nicknames updated, and have a great holiday.

Nov 18

In my work, I find that customers need to continually grab some updated labels and data, and re-import. This is tedious.

Worse, it’s in the Windows world, so by comparison, scripting is in a toddler world (small, doesn’t understand, and has tantrums).

I end up using something like the following, understanding that pre-sharing a public SSH key is safer.

@echo off

plink.exe -l ciscouser1 -pw Secr3tP@ssw0rd 192.168.0.1 "show device-alias database" > cisco1.csv
plink.exe -l ciscouser2 -pw Secr3tP@ssw0rd 192.168.0.2 "show fcalias" > cisco2.csv
plink.exe -l brocadeuser1 -pw Secr3tP@ssw0rd 192.168.0.3 "zonecfg" > brocade1.csv
plink.exe -l brocadeuser2 -pw Secr3tP@ssw0rd 192.168.0.4 "alishow" > brocade2.csv

gawk.exe -f brocade-alishow2wwncsv.awk cisco1.csv cisco2.csv brocade1.csv brocade2.csv > nicknames-by-WWN.csv
gawk.exe -f unique-nicknames.awk nicknames-by-WWN.csv > E:\VirtualWisdomData\DeviceNickname\nicknames.csv

We’ve edited “brocade-alishow2wwncsv.awk” to accommodate broader formats, but I haven’t been able to check it on a wide range of platforms.

Nov 01

I’ve been working quite a bit recently with NTP, so I wanted to record some notes.

I’ve found that when the NTP server has been reached fewer than four times, the client will not choose a source. When it does choose a source, a local source might cause it to flag itself as “unreliable” (leap flag with a bad result). That, or a poor stratum (worse than the client’s fudged 127.127.1.0), causes a client to ignore a source.

In our case, we saw an SNTP reject on status == 3 (see sntp/main.c::read_packet: “if (failed || data->status == 3”); this “status” is actually the Leap Indicator being binary 11 (== 3), which was re-instated in RFC-2030 (superseding RFC-958) as an alarm condition (when the last second plus the next second are leap-seconds).
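
For reference, the Leap Indicator is the top two bits of the first byte of an NTP/SNTP packet, followed by three bits of version and three bits of mode. A small Python sketch of pulling those fields apart; the sample byte values are made up for illustration and are not captured from our servers.

```python
# Leap Indicator meanings as listed in RFC 2030.
LEAP_MEANING = {
    0: "no warning",
    1: "last minute has 61 seconds",
    2: "last minute has 59 seconds",
    3: "alarm condition (clock not synchronized)",
}


def decode_first_byte(byte):
    """Split the first byte of an NTP packet into (leap, version, mode)."""
    leap = (byte >> 6) & 0b11
    version = (byte >> 3) & 0b111
    mode = byte & 0b111
    return leap, version, mode


# Made-up sample bytes: 0x24 is a healthy v4 server reply, 0xE4 has LI == 3.
for sample in (0x24, 0xE4):
    leap, version, mode = decode_first_byte(sample)
    print(hex(sample), leap, version, mode, LEAP_MEANING[leap])
```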

The client is flagging 0x2001 and 0x6001 quite frequently; these are clearly PLL/FLL changeovers, implying that it often sees a “weak” source, perhaps one that jitters too often, and swaps mode.
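
A quick way to see which bit differs between those two values. This sketch assumes the words are kernel clock status flags as defined in a Linux-style timex.h; the bit values in the table are written from memory and cover only a few flags, so check them against your platform's header before relying on them.

```python
# Partial table of STA_* bits (assumed values; verify against your timex.h).
STATUS_BITS = {
    0x0001: "STA_PLL",     # PLL updates enabled
    0x0008: "STA_FLL",     # FLL mode selected
    0x0040: "STA_UNSYNC",  # clock unsynchronized
    0x2000: "STA_NANO",    # nanosecond resolution
    0x4000: "STA_MODE",    # 0 = PLL, 1 = FLL
}


def decode_status(word):
    """Return the names of the known status bits set in word, lowest bit first."""
    return [name for bit, name in sorted(STATUS_BITS.items()) if word & bit]


for word in (0x2001, 0x6001):
    print(hex(word), decode_status(word))

# The single bit that changes between the two flagged values.
changed = 0x2001 ^ 0x6001
print(hex(changed), decode_status(changed))
```

Under that assumption, the only difference between the two words is the mode bit, which is consistent with the PLL/FLL reading above.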

I’ll add to this over time if I see additional factors. As always — ALWAYS — “ntpq -c opeers” (in whatever form is permitted by your OS) is the best first-step, although “readvar” is also a good first step.

Oct 28

When continuing to build, using automake and autoconf, I ran into this:

checking whether make sets $(MAKE)... yes
checking for gcc... gcc
checking for C compiler default output file name...
configure: error: C compiler cannot create executables
See `config.log' for more details.
make: *** [config.status] Error 77

The config.log shows:


configure:2661: checking for C compiler default output file name
configure:2688: gcc conftest.c >&5
ld: library not found for -lcrt1.10.6.o
collect2: ld returned 1 exit status

There’s a lot of discussion about this, but basically, Apple didn’t check their own tool. Shame on you, Apple.

The fix is simple, embarrassingly so:

sudo ln -s /Developer/SDKs/MacOSX10.6.sdk/usr/lib/crt1.10.6.o /Developer/usr/llvm-gcc-4.2/lib

I would expect that this needs to be updated every release.

Oct 28

Like everyone else, my Xcode install broke the command-line tools I use very often. It seems Apple didn’t feel like testing their command line stuff at all, since it’s glaringly obvious that it fails:


autoreconf: Entering directory `.'
autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal --force
autom4te: m4sugar/m4sugar.m4: no such file or directory
aclocal: /Developer/usr/bin/autom4te failed with exit status: 1
autoreconf: aclocal failed with exit status: 1

I mean “works” and “fails”, polar opposites, clearly no one checked. Shame on you, Apple.

The fix was simple (thanks to Nathan Herring’s consideration of ADL’s post in Stack Exchange):


*** /Developer/usr/share/autoconf/autom4te.cfg 2011-10-28 00:15:15.000000000 -0700
--- /Developer/usr/share/autoconf/autom4te.cfg 2011-10-28 00:14:33.000000000 -0700
***************
*** 99,101 ****
begin-language: "Autoconf-without-aclocal-m4"
! args: --prepend-include /usr/share/autoconf
args: --cache=autom4te.cache
--- 99,101 ----
begin-language: "Autoconf-without-aclocal-m4"
! args: --prepend-include /Developer/usr/share/autoconf
args: --cache=autom4te.cache
***************
*** 126,128 ****
begin-language: "Autotest"
! args: --prepend-include /usr/share/autoconf
args: autotest/autotest.m4f
--- 126,128 ----
begin-language: "Autotest"
! args: --prepend-include /Developer/usr/share/autoconf
args: autotest/autotest.m4f
***************
*** 140,142 ****
begin-language: "M4sh"
! args: --prepend-include /usr/share/autoconf
args: m4sugar/m4sh.m4f
--- 140,142 ----
begin-language: "M4sh"
! args: --prepend-include /Developer/usr/share/autoconf
args: m4sugar/m4sh.m4f
***************
*** 152,154 ****
begin-language: "M4sugar"
! args: --prepend-include /usr/share/autoconf
args: m4sugar/m4sugar.m4f
--- 152,154 ----
begin-language: "M4sugar"
! args: --prepend-include /Developer/usr/share/autoconf
args: m4sugar/m4sugar.m4f

Jun 21

For ~19 years, I’ve pushed international date formats; date formatting is a strange detail that seems to be so significant to me.

Someone (“lion”) whose industry knowledge I strongly respect, and who has an attitude of “what will XXX do for me today?” (i.e. not flippantly changing direction on things without clear benefit) admitted to switching to international date format.

Why is that so cool to me?

  • it validates my position, but may not yet validate the stubbornness with which I’ve pursued this
  • confirms the benefits are not merely in my biased opinion
  • it’s easier for me personally :) (it’s all about me)

In this case, the benefits are those of the ISO-8601 format yyyy-mm-dd (the basis for the W3C and RFC-3339 date formats, and used by the XML recommended datatypes): it is unambiguously understood in other countries, and it sorts correctly where least expected (filenames of presentations and revisions of tech pubs) because it is friendly to sort algorithms no more complex than strcmp(). These benefits have made it easier to find the earliest or latest versions of documents where dozens with similar content may exist.
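
The sorting point is easy to demonstrate. In this Python sketch the filenames are invented; a plain string sort stands in for strcmp().

```python
# Invented filenames carrying the same three dates in two formats.
iso_names = [
    "2011-12-20-notes.txt",
    "2011-06-21-notes.txt",
    "2011-11-01-notes.txt",
]
dmy_names = [
    "20-12-2011-notes.txt",
    "21-06-2011-notes.txt",
    "01-11-2011-notes.txt",
]

# A plain string sort, no date parsing at all.
iso_sorted = sorted(iso_names)
dmy_sorted = sorted(dmy_names)

print(iso_sorted)  # June, November, December: chronological
print(dmy_sorted)  # November, December, June: sorted by day of month
```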

I still argue that:

  1. Locale is not Language: Just because I’m in China, I may not speak Chinese (and which China? Which Chinese?)
  2. Locale is not format: in Canada, we use date formats dd/mm/yy, dd-mmm-yy, and yymmdd — so which one?
  3. Language is not format: en_CA is similar to en_US, but 2/3/4 is a different date than you think
  4. There are more cities than Timezones: if you’re choosing a timezone, choose a timezone, not a city, unless it helps to suggest the actual timezone
  5. Don’t accidentally get into politics: Wales and London are not so United of Kingdom; if you don’t offer “Cardiff” and “Bristol” as cities, then allow the user to choose their timezone without claiming to be British (or is that “English”?). If you don’t understand that, spend a week in Wales.
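
Point 3 can be shown with one string and three plausible readings. A small Python sketch: the input is zero-padded as “02/03/04” because Python's two-digit year directive wants two digits, and the format labels are just descriptions chosen for this example.

```python
from datetime import datetime

text = "02/03/04"

# Three conventions that all accept the same string.
readings = {
    "mm/dd/yy": datetime.strptime(text, "%m/%d/%y").date(),
    "dd/mm/yy": datetime.strptime(text, "%d/%m/%y").date(),
    "yy/mm/dd": datetime.strptime(text, "%y/%m/%d").date(),
}

for label, value in readings.items():
    # isoformat() prints yyyy-mm-dd, which has only one reading.
    print(label, value.isoformat())
```

All three parses succeed without complaint, and each yields a different day.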

RFC-822 and RFC-2822 force foreign countries to know English day names and month names (not hard, but an entry point for error) and require non-trivial parsers, but some developers just don’t seem to consider that there might be an easier way.

Tools that follow java.util.Date or PHP date formats, or make assumptions based on /etc/tz values, will continue to limit their users’ choices, further limiting their users’ end-users’ choices. Think about what constraints you’re imposing. “Rome wasn’t built in a day”, “you can’t turn a supertanker”, “can’t change China”: you don’t have to try to change the world, just don’t participate in the constraint.

Hey, it’s nice to be right on occasion. Unfortunately, I still need to choose Japan or China as locales in order to get usable dates.