Examples of Blogging I did for Evident Software



WebLogic Optimization Tip #1 – Understand the database tier

WebLogic optimization offers many possibilities because of the numerous technologies and monitoring tools involved. One crucial area is the database; this post covers the RDBMS, with subsequent posts in this series addressing NoSQL and data caching.

  1. JNDI lookups are relatively expensive, so use caching for any objects that initialize database connections (or ANY connections for that matter, such as JMX connections that might be done for server performance monitoring).
  2. Use resource pooling for connections to the database from WebLogic. The WebLogic JDBC connection pooling feature allows this to be configured relatively easily.
  3. The Prepared Statement Cache should be optimized and used. This cache keeps compiled SQL statements in memory, thus avoiding a round-trip to the database when the same statement is used later.
  4. Dive deeper into that realm, and ensure that whenever possible, stored procedures, packages and similar program constructs are used rather than SQL statements.
  5. Let the RDBMS server do the heavy lifting; that’s what it’s designed to do: serve.
  6. Understand the set orientation of SQL (which abhors unnecessary use of cursors).
  7. Understand that other applications may also be accessing that database, and that the root cause of sudden poor performance may be some other business entity hitting that server. Try to ensure your database administrators keep you in the loop in such matters.
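As a rough illustration of tip 1, the sketch below caches lookup results so that each name is resolved only once. The class name LookupCache and the JNDI name jdbc/AppDS are hypothetical, and the lookup function is injected so the idea can be tried outside an application server; the forJndi() variant assumes a configured JNDI provider such as the one WebLogic supplies.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import javax.naming.InitialContext;
import javax.naming.NamingException;

/** Hypothetical helper: remembers the result of each lookup by name. */
final class LookupCache {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    private final Function<String, Object> lookup;

    LookupCache(Function<String, Object> lookup) {
        this.lookup = lookup;
    }

    /** Variant backed by a real JNDI context (assumes a configured provider). */
    static LookupCache forJndi() {
        return new LookupCache(name -> {
            try {
                return new InitialContext().lookup(name);
            } catch (NamingException e) {
                throw new IllegalStateException("JNDI lookup failed: " + name, e);
            }
        });
    }

    /** Runs the expensive lookup on the first request only; later calls hit the cache. */
    Object get(String name) {
        return cache.computeIfAbsent(name, lookup);
    }
}
```

Inside the server, one shared instance created with LookupCache.forJndi() could serve a call such as get("jdbc/AppDS"), with the result cast to javax.sql.DataSource. A failed lookup is not cached, so it is retried on the next call.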

These are just some of the considerations that take us out of the application tier and into the database tier, but which may result in perceived poor application performance.

NoSQL DB basics for the RDBMS-savvy

What does a traditional RDBMS programmer or architect need to understand to be productive with NoSQL (Not-only SQL technologies) and DCP (data caching platforms)?

I asked this question of our development team.

Here’s their list of things to know:

  1. Understand how ACID compares with BASE (Basically Available, Soft-state, Eventually Consistent)
  2. Understand persistence vs non-persistence, i.e., some NoSQL technologies are entirely in-memory data stores
  3. Recognize there are entirely different data models from traditional normalized tabular formats: Columnar (Cassandra) vs key/value (Memcached) vs document-oriented (CouchDB) vs graph oriented (Neo4j)
  4. Be ready to deal with no standard interface like JDBC/ODBC or standardized query language like SQL; every NoSQL tool has a different interface
  5. Architects: rewire your brain to the fact that web-scale/large-scale NoSQL systems are distributed across dozens to hundreds of servers and networks as opposed to a shared database system
  6. Get used to the possibly uncomfortable realization that you won’t know where data lives (most of the time)
  7. Get used to the fact that data may not always be consistent; ‘eventually consistent’ is one of the key elements of the BASE model (I see this latency issue all the time in Twitter, in ‘Followers’ list)
  8. Get used to the fact that data may not always be available
  9. Understand that some solutions are partition-tolerant and some are not

These attributes vary from one system to another. It’s as important to understand the differences among NoSQL technologies as it is important to understand how they differ from a traditional RDBMS.
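To make 'eventually consistent' concrete for an RDBMS-minded reader, here is a toy model, not the behavior of any particular product. The class EventuallyConsistentStore and its two in-memory maps are assumptions made purely for illustration: writes land on a primary copy, and a replica only catches up when replication runs.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Toy model: a primary copy plus one replica that lags until replicate() runs. */
final class EventuallyConsistentStore {
    private final Map<String, String> primary = new HashMap<>();
    private final Map<String, String> replica = new HashMap<>();
    private final Deque<String[]> pending = new ArrayDeque<>();

    /** The write is acknowledged once the primary has it; the replica is updated later. */
    void write(String key, String value) {
        primary.put(key, value);
        pending.addLast(new String[] {key, value});
    }

    String readFromPrimary(String key) {
        return primary.get(key);
    }

    /** May return stale data (or null) until replication has caught up. */
    String readFromReplica(String key) {
        return replica.get(key);
    }

    /** Applies queued updates in order; stands in for asynchronous replication. */
    void replicate() {
        while (!pending.isEmpty()) {
            String[] update = pending.removeFirst();
            replica.put(update[0], update[1]);
        }
    }
}
```

A read from the replica between write() and replicate() returns the old value, which is the same effect as a follower count that lags behind reality for a while.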

Here is a pretty good list of the many NoSQL products, from a respected member of the community, Alex Popescu.

Learn more about our performance monitoring solution for Java, NoSQL and web servers.

Apache Cassandra Notes

Apache Cassandra 0.7 was announced recently, and downloads of the powerful and well-vetted distributed data maintenance solution proceed apace. (Hmm, Apache, apace…synchronicity, or neighbo[u]ring memory neurons connecting through semantic patterns?)

One of the stars in the NoSQL firmament, Apache Cassandra is busily working behind the scenes worldwide to promote scalability and uptime for big names in the business and social media world.

We at Evident Software offer Application Performance Management tools for Cassandra and other NoSQL technologies. Our interest in this Apache project is twofold, however: we are also incorporating Cassandra 0.7 into Evident ClearStone v5.0, which is going to beta on Feb 14th. Won’t you be our beta tester Valentine?

NoSQL DB Performance Comparison – Tips and Techniques from the Experts

One approach to NoSQL DB performance comparison is to do time-on-time comparisons of single or multiple metrics with charts representing different monitored resources. Don Jeffery puts it this way: “Depending on what your objective is when you are benchmarking, one of the things that you might do is to take a quiescent system, one that’s spun up but not incurring any load, and introduce some load to it over a period of time. For example, in Coherence, you establish the cluster, introduce, say, eight nodes, maybe eight JVMs over four loads, with caches introduced, but no clients.”

“You would monitor this quiescent system and inspect certain key metrics and expect them to be relatively flat. You’d then introduce loads and keep track of the times and loads introduced.” (Since we store history as time series data in Cassandra, getting this information for potentially long stretches of time is not an issue.)

Don continues: “If you keep track of the time and the load, then you can introduce time-on-time comparisons where you look at a number of key metrics. So now I have a benchmark; what I might want to do is vary that load in a certain way over a different time period, and perhaps compare against the initial benchmark and the previous set of measurements, to see if I can establish any kind of a trend. For example, I’ve attempted to increase the GET burden against the cache by 50%; does that translate into an equivalent translation of certain key metrics, or is there no linear relationship?”
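The arithmetic behind such a time-on-time comparison can be sketched as follows. The class TimeOnTime and its method names are hypothetical and are not part of ClearStone; the sketch only assumes that the samples for the baseline period and for the later run are already available as arrays.

```java
/** Hypothetical helper for comparing a metric across two measurement periods. */
final class TimeOnTime {

    /** Arithmetic mean of the samples collected during one period. */
    static double mean(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double sum = 0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    /** Percentage change of the mean from the baseline period to the later run (baseline mean must be non-zero). */
    static double percentChange(double[] baseline, double[] run) {
        double base = mean(baseline);
        return (mean(run) - base) / base * 100.0;
    }

    /**
     * Ratio of the metric change to the load change. A value near 1.0 suggests the
     * metric scaled roughly in proportion to the load; values far from 1.0 suggest it did not.
     */
    static double scalingRatio(double loadChangePercent, double[] baseline, double[] run) {
        return percentChange(baseline, run) / loadChangePercent;
    }
}
```

For Don's example, a 50% increase in the GET burden that doubles a response-time metric gives a scaling ratio of 2.0, a hint that the relationship is not linear.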

We are about to introduce time-on-time measurements and thereby support this type of analysis, with some of the visualizations we are planning in future releases of ClearStone. Multiple perspectives in the ClearStone real-time dashboard, shared and customized, already support the presentation of charts from various resources in a free-form and easy-to-create manner.

Performance comparisons of different NoSQL DB technologies may be possible, Don said, if they are similar, say, two data grid technologies, with a deliberate and documented use of load and benchmarks. “You might look at response times, for example, or GETs, or look at cache hits.” This could enable a useful technology choice in situations where you know what kind, what size and what elasticity you anticipate in your data.

“Take the example of ehcache performance. The first consideration is to define what the key metrics are. One way to do that even with our system today is to establish reasonable thresholds and to use our thresholding policy tools to set those up and capture them and create events. That way we see if we violate any of those thresholds. We start with a quiescent system and introduce a load over an hour; during that time, ECS is running, so our collectors are doing their job. Line charts we assemble are being annotated with the events we’ve defined at the points of transition to the threshold.”
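The idea of annotating a chart at the points where a metric crosses a threshold can be sketched in a few lines. This is an illustration only: the class ThresholdEvents is hypothetical and does not represent ClearStone's thresholding policy tools.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that finds the points at which a series rises above a threshold. */
final class ThresholdEvents {

    /** Returns the indexes of samples that rise above the threshold after a sample at or below it. */
    static List<Integer> upwardCrossings(double[] samples, double threshold) {
        List<Integer> events = new ArrayList<>();
        for (int i = 1; i < samples.length; i++) {
            if (samples[i - 1] <= threshold && samples[i] > threshold) {
                events.add(i);
            }
        }
        return events;
    }
}
```

Only the transition is reported, not every sample above the threshold, which keeps a line chart from being flooded with annotations during a sustained breach.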

“We can do this with our product today, but not as conveniently as with 5.0, where we could use a single chart. We can do another test run and vary the load. For a data caching product, for example, we introduce additional caches or perhaps retune the network or cluster and start it up again with larger or smaller Java heap sizes or any number of other parameters we want to investigate; then we look at the behavior; then we run another load test, or perhaps the same load with different heap sizes. What we can do is to create a perspective in the real time dashboard; create two charts and compare them by using a metric that correlates the two time periods. What we hope to do soon is to have this comparison appear in the same chart. While we can do a better job of integrating the visualization, we can support those use cases today with two separate charts that are visually aligned in such a way as to allow comparison.”

Embedding Cassandra within Tomcat for testing

Evident ClearStone 5.0 uses Cassandra for its persistent storage of application performance monitoring data, as written about elsewhere in our blog. There are some cases where you might want to embed Cassandra within your Java application or run it as a webapp within your application server for testing purposes. Ching-Cheng Chen, longtime stalwart of the company, sat with me recently to help me with this blog post, furnishing all the detail. Let me start by sharing his strong suggestion that embedding Cassandra within Tomcat is not a good idea for production.

Here are some notes about embedding Cassandra within Tomcat:

Take a look at the Cassandra (for 0.7.0 release) start up script; it will give you clues on how to start up/shut down Cassandra within your own Java class.

To start up a Cassandra node, instantiate an org.apache.cassandra.thrift.CassandraDaemon object and invoke its activate() method in a separate thread.

To shut down the Cassandra node, invoke the deactivate() method on the CassandraDaemon object you created during start up.
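The start-in-a-thread, stop-on-request pattern described above might look roughly like this. To keep the sketch runnable without the Cassandra jars, NodeDaemon is a hypothetical stand-in interface; in a real test harness an implementation of it would delegate to the activate() and deactivate() methods of the CassandraDaemon object. EmbeddedNode is likewise an invented name.

```java
/** Hypothetical stand-in for the daemon object; a real adapter would delegate to CassandraDaemon. */
interface NodeDaemon {
    void activate();   // expected to block while the node is serving
    void deactivate(); // asks the node to stop
}

/** Runs a daemon's activate() in its own thread so the caller is not blocked. */
final class EmbeddedNode {
    private final NodeDaemon daemon;
    private Thread thread;

    EmbeddedNode(NodeDaemon daemon) {
        this.daemon = daemon;
    }

    /** Starts the node in a separate thread; calling start() twice has no further effect. */
    synchronized void start() {
        if (thread != null) {
            return;
        }
        thread = new Thread(daemon::activate, "embedded-node");
        thread.setDaemon(true); // do not keep the JVM alive just for the test node
        thread.start();
    }

    /** Requests shutdown and waits up to five seconds for the node thread to finish. */
    synchronized void stop() {
        if (thread == null) {
            return;
        }
        daemon.deactivate();
        try {
            thread.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        thread = null;
    }

    synchronized boolean isRunning() {
        return thread != null && thread.isAlive();
    }
}
```

In a webapp, start() and stop() would typically be called from whatever lifecycle hooks the application already has for test setup and teardown.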

The default Cassandra thread pool worked fine for us in our testing mode. Everyone should modify their Cassandra thread pool configuration according to their environment, though.

But what about the configuration? By default, Cassandra loads the configuration file cassandra.yaml if it is found in the classpath. If you need a different configuration based on the environment, you can always generate a configuration file on the fly. To generate a Cassandra configuration file dynamically, instantiate an org.apache.cassandra.config.Config object and populate it with your preferred configuration, then write the configuration using the snakeyaml API included in the Cassandra distribution.

import java.io.File;
import java.io.FileWriter;

import org.apache.cassandra.config.Config;
import org.apache.cassandra.utils.SkipNullRepresenter;
import org.yaml.snakeyaml.Dumper;
import org.yaml.snakeyaml.DumperOptions;
import org.yaml.snakeyaml.Yaml;
import org.yaml.snakeyaml.nodes.Tag;

// create configuration object
Config config = new Config();
config.cluster_name = "ClusterName";
config.rpc_port = 9160;

// write configuration file; classPathDir is a directory on the classpath
FileWriter fw = new FileWriter(new File(classPathDir + "/cassandra.yaml"));

// emit block-style YAML and map Config to a plain YAML map
DumperOptions options = new DumperOptions();
options.setDefaultFlowStyle(DumperOptions.FlowStyle.BLOCK);
SkipNullRepresenter representer = new SkipNullRepresenter();
representer.addClassTag(Config.class, Tag.MAP);
Dumper dumper = new Dumper(representer, options);
Yaml yaml = new Yaml(dumper);
yaml.dump(config, fw);
fw.close();

I close with this reminder: Do NOT use this for production; your application(s) will compete for resources with Cassandra; that is certainly not a good thing. For the simpler testing environment, though, it can be useful.

Impetus can provide just that

I recently was invited to a webinar on Hadoop and NoSQL by Impetus, a self-styled “Big Data Services” company. Let me start off with compliments to the Impetus team for their very professional delivery. It was evident that care had been taken in preparation, the timing and pacing were spot on, and all speakers handled with aplomb the several handoffs that were interspersed with poll questions in the presentation. Sanjay Sharma, Technical Architect, and Gaurav Nigam, Module Lead, were the main speakers.

A video of the webinar will be available in a few days, but for now, some preliminary thoughts, and what I as a novice to Hadoop (and just one step removed from that status as a NoSQL guy) thought were highlights.

The moderator started by describing Hadoop and NoSQL as ‘two game-changing technologies’. Hadoop is a framework, a set of MapReduce APIs on top of Java. As such, not being a radically new technology, Hadoop is not difficult for a developer to learn. Hadoop works in batch mode, something that Impetus speakers emphasized, as it can impact how some activities, such as unit testing, are to be approached. One tip: write unit tests for your MapReduce jobs.

The ease of transitioning to Hadoop was to crop up at several other points, good news for harried IT shops with ‘performance pressure’; repurposing business logic, for example, was cited as one easy migration vector. Of course, some learning is required, and that gets into non-tech areas, such as the cost of doing this. Identifying which projects lend themselves to the new game changers can be a challenge, and this is one place where Impetus comes in; Impetus offers services such as a deployment toolkit for Hadoop.

One question that Impetus suggested project owners ask themselves: is the app compute intensive, leaning more toward a choice of, say, Erlang, or data intensive, where Hadoop enters the arena?

NoSQL, as most readers of the Evident blogs probably know, is an appropriately ‘elastic’ label stretched over a number of products/projects that fall into four categories: ColumnFamily, Graph, Key-Value (such as Memcached) and Document. Our ClearStone v5.0 product uses, for example, the Neo4j graph NoSQL product, and ColumnFamily champ Cassandra. The NoSQL world is characterized by high availability and amazing scalability, with interesting tradeoffs, such as ‘eventually consistent’ data, as the technology is not transactional.

One point emphasized: for most shops, the traditional development approach can be used for the Hadoop/NoSQL life cycle. As long as stakeholders understand MapReduce, there should be a smooth transition.

Impetus recommended verification with a Proof of Concept (PoC), and offered free PoCs to a ‘select few’, based on submissions from attendees about their Big Data needs, a nice move, methinks.

Impetus claims “a strong focus and established thought leadership in the area of Big Data analytics and high performance computing” and offers a well-tested Global Delivery Model to help you evaluate and implement solutions tailored to your specific technical and business context.

JConsole Alternative – How to Get More Metrics and Better Analytics

While venerable and valuable, JConsole does not offer historical metrics and trending; the field is open to a JConsole alternative. Evident Software is proud to offer such an alternative: its ClearStone application and server performance monitoring suite. Developers are no doubt familiar with JConsole, a tool first introduced with Java 5.0. Part of the JDK, JConsole is built on the Java Management Extensions (JMX) and the java.lang.management API. Tapping directly into the internals of the JVM, this utility provides a wrapper around the JMX MBeans in any local or remote platform MBeanServer.

I recently had the opportunity to shoehorn an interview into the very busy schedule of Evident’s sprint project manager Tim Sneed about the ClearStone Management Pack for Java, the Evident JConsole alternative. Some tidbits:

The collection configuration utility in the administration console allows one to browse and connect to an MBean server, much like one can do with the standard JConsole. One salient difference, however—a difference that may be crucial for developers—is that charting custom MBeans is possible in ClearStone, whereas in JConsole it is not. Developers rightly are concerned with understanding the dynamics of an elastic and distributed ecosystem as much as possible; the advent and rapid rollout of Big Data technology puts a premium on being able to see and integrate metrics on every IT asset of significance (from back-end data stores to web request response times) within one monitoring platform.

With ClearStone, developers are able to see important metrics by exposing them through their own custom MBeans. Doing so provides developers with macro-to-micro visibility of their entire deployment stack to see how their application performs over time; who would ever turn down a greater and richer volume of wide-ranging metrics like this?
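For readers who have not written one, a custom MBean takes only a few lines of standard JDK code. The sketch below is a minimal example using the platform MBeanServer; the names CustomMetrics, RequestStats and the object name com.example.demo:type=RequestStats are hypothetical, and nothing here is specific to ClearStone. Any JMX client that can browse the MBean server should then see a RequestCount attribute.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Hypothetical example of exposing an application metric through a standard MBean. */
final class CustomMetrics {

    /** Standard MBean convention: public interface named after the class plus "MBean". */
    public interface RequestStatsMBean {
        long getRequestCount();
    }

    public static final class RequestStats implements RequestStatsMBean {
        private final AtomicLong count = new AtomicLong();

        public void recordRequest() {
            count.incrementAndGet();
        }

        @Override
        public long getRequestCount() {
            return count.get();
        }
    }

    /** Registers a new RequestStats bean with the platform MBeanServer under the given name. */
    static RequestStats register(String objectName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            RequestStats stats = new RequestStats();
            server.registerMBean(stats, new ObjectName(objectName));
            return stats;
        } catch (JMException e) {
            throw new IllegalStateException("could not register " + objectName, e);
        }
    }

    /** Reads the attribute back through JMX, the way a monitoring client would. */
    static long readRequestCount(String objectName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return (Long) server.getAttribute(new ObjectName(objectName), "RequestCount");
        } catch (JMException e) {
            throw new IllegalStateException("could not read " + objectName, e);
        }
    }
}
```

The application calls recordRequest() wherever the event of interest happens; the monitoring side never touches application code and only reads the attribute.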

In addition to just passive capture, ClearStone is a worthy JConsole alternative in that threshold detection and notification are available. For example, assume that you have been performing JConsole heap dumps and JConsole thread dumps. You constrain your app to never exceed 300 MB of heap. If that threshold is ever breached, an automatic email can be sent or an event logged (visible to ClearStone’s Event Viewer) to make it clear that some code has to be adjusted.
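The 300 MB example can be approximated in plain Java with the java.lang.management API. This is only a sketch of the check itself: the class HeapBudget is hypothetical, the notification is reduced to a returned string, and ClearStone's own threshold policies are configured in the product rather than coded like this.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical check of current heap usage against a fixed budget. */
final class HeapBudget {
    /** The 300 MB budget from the example above, in bytes. */
    static final long LIMIT_BYTES = 300L * 1024 * 1024;

    static boolean isBreached(long usedBytes, long limitBytes) {
        return usedBytes > limitBytes;
    }

    /** Heap currently in use by this JVM, as reported by the MemoryMXBean. */
    static long currentHeapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Returns a line suitable for a log or an alert; a real system would send a notification. */
    static String check() {
        long used = currentHeapUsedBytes();
        return isBreached(used, LIMIT_BYTES)
                ? "EVENT: heap use of " + used + " bytes exceeds limit of " + LIMIT_BYTES
                : "OK: heap use is " + used + " bytes";
    }
}
```

Calling check() on a schedule gives a crude version of threshold detection; what it cannot give is the history of the readings, which is the gap discussed next.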

Another shortcoming of JConsole is the lack of historical perspective. Once the user exits JConsole, or if the target server is restarted for some unknown reason, all the previous stats are gone! Imagine running JConsole for several days, then making that innocent mistake. Evident’s JConsole alternative offers history in addition to realtime information.

And how many instances of JConsole are sometimes needed to make proper sense of performance/stability/consistency in a testing environment? As native functionality, ClearStone’s JConsole alternative offers the ability to create perspectives, with each perspective containing multiple charts.

ClearStone provides the perfect transition from ‘Dev’ to ‘Ops’ with its functionality and pricing models, making this the best ‘DevOps’ tool for monitoring your deployments.

In addition to its JMX-based Flex Management Pack for Java, Evident offers Management Packs for Oracle Coherence, Memcached, Cassandra, JBoss, and WebLogic. The new version, 5.0, soon to undergo Beta testing (see below for a sign-up link), will also support easy instrumenting via the RESTful Evident ODI (Open Data Interface); sample file formats are CSV and XML. The ODI option allows not only streaming of data but, in the future, support for synthetic event injection, which will empower the developer with incredible levels of detail to encourage more informed decision-making.

Cassandra performance – impact of embedding Cassandra within Tomcat

Recently, I posted a tip from one of our veterans about embedding Cassandra within Tomcat for testing, including a suggestion to not do so in production, mainly for performance reasons.

This post generated a comment from a reader named Morten:

“I have actually been considering using Cassandra embedded with Tomcat 7 to run a full stack in the same process, so I am real interested in knowing more about the reason for the strong discouragement of not doing this? Are there any tests, profiling results, or similar behind this?”

I passed the question to Ching-Cheng Chen, and here is the information he furnished:

It probably depends heavily on what your application is doing. If your application is very light and uses very few resources, then you might be able to live with that. However, if your application is also performing some heavy logic, especially the type of logic that causes lots of heap usage, you really want to avoid embedding Cassandra in your application.

We don’t have any profiling to specifically prove this; however, we have been seeing way different GC behaviors. Our application logic creates many “temporary” objects, with a life cycle long enough to survive a few minor GCs.

If you check Cassandra’s default GC setting, it sets MaxTenuringThreshold to 1. The idea behind that setting is probably that everything that survives one minor GC is most likely meant for cache, so it should be promoted into old gen ASAP. (I pressed Ching-Cheng for clarification of ‘old gen’, and he responded: “It’s a Java GC term; heap memory for Java is kind of divided into multiple ‘regions’: old gen/tenured, eden space, young gen, survivor space, etc. Newly created data is generally put in young gen first, and that data is a candidate for minor GC. Long-lived data will eventually be promoted to old gen/tenured and will no longer be processed during minor GC, but will be checked during major GC. http://www.oracle.com/technetwork/java/gc-tuning-5-138395.html”)

This is where the GC settings for Cassandra and for the application conflict. He observed that if we set MaxTenuringThreshold to 1 (Cassandra’s default setting), the old gen fills up too fast and we constantly hit stop-the-world GC pauses of a few seconds.

If we set the MaxTenuringThreshold higher, then we see way more data being copied to survivor space and more time spent on minor GC.

We then tried to decouple Cassandra by using two JVMs; the JVM running Cassandra used the default Cassandra GC setting and the application JVM didn’t set the MaxTenuringThreshold (I think the default is 31 or 32?). Neither JVM had any stop-the-world GC anymore.

The bottom line is that if you are satisfied with the performance after you embed Cassandra, then go for it, but most likely you will get better performance by decoupling Cassandra from your application.
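Anyone who wants to check this on their own stack can read both the tenuring flag and the collectors' running totals from inside the JVM. The sketch below uses only the standard java.lang.management API; the class name GcInspector is hypothetical, and the collector names it reports depend on which garbage collector the JVM is running.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper for inspecting GC settings and activity of the current JVM. */
final class GcInspector {
    private static final String FLAG = "-XX:MaxTenuringThreshold=";

    /** Extracts the MaxTenuringThreshold value from a list of JVM arguments, if it was set. */
    static Optional<String> tenuringThreshold(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith(FLAG)) {
                return Optional.of(arg.substring(FLAG.length()));
            }
        }
        return Optional.empty();
    }

    /** Same check, applied to the arguments this JVM was started with. */
    static Optional<String> tenuringThresholdOfThisJvm() {
        return tenuringThreshold(ManagementFactory.getRuntimeMXBean().getInputArguments());
    }

    /** Maps each collector name to {collection count, accumulated collection time in ms}. */
    static Map<String, long[]> collectorTotals() {
        Map<String, long[]> totals = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            totals.put(gc.getName(), new long[] {gc.getCollectionCount(), gc.getCollectionTime()});
        }
        return totals;
    }
}
```

Sampling collectorTotals() before and after a load test, once with Cassandra embedded and once with it in its own JVM, gives a simple way to compare time spent in young-generation versus old-generation collections.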

Inside Evident ClearStone 5.0 – Cassandra, Neo4j and ODI

ClearStone 5.0 is now in GA, with compelling new functionality and a long, clear growth path due to good architecture choices. The main architectural change, with significant implications: ClearStone 5.0 was re-architected around the NoSQL DBs Cassandra and Neo4j.

There were several challenges we faced in re-architecting ClearStone, and we’re proud to have handled them in stride.

  1. We were accustomed to using an RDBMS for storage of application and server performance metrics. In an RDBMS, triggers are a bulletproof way of sensing change to data and having the option to take some action, such as sending notifications. While other NoSQL technologies offer an equivalent to triggers, Cassandra does not, so we had to come up with a way to emulate this functionality. (We and others have requested this feature from the Apache Cassandra project, so it may show up in the future). We changed ClearStone’s data model to accommodate this new reality.
  2. Neo4j presented a fascinating challenge. We store and model data as a series of connected nodes; Neo4j, with its very low impedance mismatch with our design, made it look like it would be fairly easy. What is challenging is the wide-open nature of Neo4j qua implementation of a graph db; you can define nodes and edges (relationships) any way you want; they can have or not have properties, and can have different types of directionality. So the challenge was to figure out how we wanted to represent all the data without fully knowing in advance how we wanted to traverse it and get correlations; that was the challenge from the standpoint of engineering the model.
  3. Another design challenge: we are consuming arbitrary time series data structures. ODI (Open Data Interface), our RESTful API for instrumenting most any IT resource, opens up the product to accept any data from any IT resource; because of this open API, we can’t know in advance what data model will characterize data being instrumented, so we needed maximum possible flexibility. An RDBMS would not only require a formal schema but an attendant requirement to normalize everything, or at least start with that.  Cassandra combines that needed flexibility with high accessibility; Cassandra’s column family implementation is “very forgiving”, as Ivan Ho, Evident’s CTO has said; columns can be added on the fly. (Another feature of Cassandra was especially persuasive: in prior versions of ClearStone, due to its use of RDBMS, there was a discontinuity of the data; a different application was used for history. Cassandra solved that problem, allowing display of current and historical data.)
  4. Cassandra has no query language; all access is through its API. (Neo4j, in contrast, can be queried with Lucene, one of several methods of retrieval.) Using a globally unique key assigned during its inception in Neo4j, we can retrieve the needed Cassandra metric, event and entity information.
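One common way to approximate triggers when the data store has none is to notify listeners on the application's own write path. The sketch below illustrates that general idea only; it is an assumption, not a description of how ClearStone emulates triggers, and the class NotifyingStore with its in-memory map is a hypothetical stand-in for the real store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical write path that tells listeners about every change, trigger-style. */
final class NotifyingStore {
    private final Map<String, Double> latest = new HashMap<>();
    private final List<BiConsumer<String, Double>> listeners = new ArrayList<>();

    /** Registers a callback that is invoked after each write with the key and new value. */
    void onWrite(BiConsumer<String, Double> listener) {
        listeners.add(listener);
    }

    /** Stores the value (the map stands in for the real data store), then notifies listeners. */
    void put(String key, double value) {
        latest.put(key, value);
        for (BiConsumer<String, Double> listener : listeners) {
            listener.accept(key, value);
        }
    }

    Double get(String key) {
        return latest.get(key);
    }
}
```

The trade-off compared with a database trigger is that only writes going through this code path are seen; anything written to the store by another route goes unnoticed.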

Future versions will benefit from the flexibility we architected into the product with these NoSQL DBs, as will our customers; expect to see additional Management Packs over the next few weeks; when we ‘sprint’, we take it literally :)

Cloud Performance Monitoring on Private, Public and Hybrid Clouds

Cloud performance monitoring is a big challenge for enterprises. There are several reasons for this, such as:

  1. On public clouds, the cloud performance monitoring is only supported for the infrastructure, and is therefore not comprehensive.
  2. On private clouds, all the cloud performance management previously done by the vendors – mainly cloud server monitoring – must now be assumed by the organization, which can mean instrumenting and gathering metrics from scratch. In addition to this burden on developers, all the performance issues of latency, uptime, licensing, provisioning and more become enterprise responsibility.
  3. The correlation of performance monitoring metrics and events between distributed resources in multiple tiers of the infrastructure is a major challenge and not currently available in a single console. Root cause analysis is hampered by the heterogeneous, dynamic and elastic nature of the technologies that make the cloud so useful.
  4. Cloud technologies such as distributed caches, grids and in-memory stores are relatively new; the interactions among them when they share physical or virtual resources are not yet well understood; how does a Cassandra node coexist with Coherence on the same server, for example? Are any products written with the assumption that they have sole rights to the machine on which they run? Do they ‘play nicely’ with others?

Another important point to consider is unified monitoring and performance management between private and public clouds. One of the main motivations for moving from public to private is performance. For example, a MapReduce job may not get the priority customers want. In a public cloud, customers do not control prioritization and contend with potentially thousands of other users for a time window; consider how that can be a pain point if that MapReduce job is business critical, not to mention security critical, as a recent partner conversation disclosed. Also, parts of the supply chain may reside in the cloud, and the uneasiness that such exposure engenders may lead to a desire to go ‘private’. Those in charge of disaster recovery would also lobby in favor of complete integration of that important activity within the walls.

Another use case in which unified monitoring comes in useful is the hybrid cloud, a combination of both investments and models; this is likely to introduce further complexity. The hybrid cloud makes sense in a lot of scenarios: upskilling staff in preparation for a move to an entirely private configuration, (hoped for) cost reduction, better control over vital resources and more. Issues of hybrid cloud monitoring relate to a possible proliferation of monitoring approaches and solutions, some done in-house, some bought off the shelf or outsourced; this can be messy.

Unified cloud monitoring would enable an enterprise to accurately measure and benchmark performance on private and public clouds in the same way, using the same instrumentation and preferably, even on the same monitoring server. To my knowledge, support for unified monitoring in the existing cloud monitoring tools is patchy at best. Evident Software is working on addressing this need and improving our product’s cloud readiness, to allow transparent monitoring of cloud-based systems, whether they are public, private or hybrid.

Also see the live blog posts I did for Evident at GigaOm in 2011.