Purchasing gear lubricants: be careful when playing the numbers game
Written by John Sander
Important Numbers
Assuming that steps 1-10 have been considered and the final decision comes down to a comparison of data sheets, the question remains: which numbers should one consider important? A review of various gear lubricant suppliers' data sheets will show dramatic differences between the claims made. Without background knowledge, the tendency might be to choose the product with the most numbers and OEM claims on its data sheet. While this shows that the supplier was willing to make a sizeable investment in product development testing, it doesn't necessarily prove that one product is better than another for the application. Be wary of the lubricant salesperson who points to one specific data point and builds the sale around it. Other factors affect the significance of those numbers, such as relevance to the application, test precision and the units reported. Let's take a look at just a few.
Many companies will show the Timken test – ASTM D2782. What is not widely known outside of the laboratory is the precision of this test. Most ASTM test methods include a repeatability and reproducibility statement. Repeatability is a measure of error between multiple test runs, on the same sample, by the same operator running the same instrument, while reproducibility is the error between multiple test runs conducted on the same sample by different operators on different instruments.
For the Timken test, the stated repeatability is 30% of the mean and the stated reproducibility is 75% of the mean, each at the one-case-in-20 level. This means that 20 runs in the same lab will likely include at least one run with up to 30% error. The rest of the runs may be somewhat better than that, but one should allow for the possibility of 30% error. When different labs run the test, it is even worse. Imagine a sample that has a 50-lb Timken result. If the same lab runs this test, at least one test in 20 will deviate by as much as 15 lb (30% of 50 lb). Within the precision of this test, the same operator could produce a 35-lb Timken result or a 65-lb Timken result, and statistically these would be the same result. With a reproducibility of 75%, the allowable deviation between labs grows to 37.5 lb, so two different labs could produce a 12.5-lb result or an 87.5-lb result, and these would be considered statistically the same result for this imaginary 50-lb sample.
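The arithmetic in the 50-lb example can be checked with a short script. This is only a sketch of the article's simplified reading, in which the precision figure is applied as a plus-or-minus band around the mean; the function names `precision_band` and `within_band` are made up for illustration and are not part of any ASTM method.

```python
# Sketch of the article's simplified reading of the Timken precision figures.
# Assumption: the percentage is applied as +/- percent of the mean result.

REPEATABILITY_PCT = 30.0    # same lab, same operator, same instrument
REPRODUCIBILITY_PCT = 75.0  # different labs, operators and instruments


def precision_band(mean_lb, percent_of_mean):
    """Return (low, high) in lb for a mean result and a precision percentage."""
    delta = mean_lb * percent_of_mean / 100.0
    return (mean_lb - delta, mean_lb + delta)


def within_band(value_lb, mean_lb, percent_of_mean):
    """True if value_lb cannot be told apart from mean_lb at this precision."""
    low, high = precision_band(mean_lb, percent_of_mean)
    return low <= value_lb <= high


print(precision_band(50, REPEATABILITY_PCT))    # (35.0, 65.0)
print(precision_band(50, REPRODUCIBILITY_PCT))  # (12.5, 87.5)
print(within_band(60, 50, REPEATABILITY_PCT))   # True
```

Under this reading, a data sheet claiming 60 lb and another claiming 50 lb are not distinguishable on the Timken number alone, even if both came from the same lab.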
Next, it is a good idea to pay close attention to the units reported on a product data sheet. Once again using the fictitious 50-lb Timken result, a lab in the U.S. might report that data as 50 lb, while a lab in Europe might present it in metric units as 22.7 kg. Both of these are correct, but the U.S. number looks much higher. One might mistake the U.S. result for a better result when the two are actually the same measurement. As a side note, there are various standards groups active throughout the world. They might publish very similar methods, yet there can be subtle differences. One cannot assume that results obtained on the same instrument are comparable, because the method used can change the result. For example, an ASTM method might produce different results than an ISO method on the same instrument.
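One way to avoid being misled by units is to convert every data-sheet value to a common unit before comparing. The sketch below does this for the fictitious 50-lb result; the helper names are hypothetical, and the only fixed fact used is the definition of the pound as 0.45359237 kg.

```python
# Sketch: put Timken loads on a common unit before comparing data sheets.
# Helper names are illustrative only.

LB_TO_KG = 0.45359237  # kilograms per pound, exact by definition


def lb_to_kg(load_lb):
    """Convert a load in pounds to kilograms."""
    return load_lb * LB_TO_KG


def kg_to_lb(load_kg):
    """Convert a load in kilograms to pounds."""
    return load_kg / LB_TO_KG


us_sheet_lb = 50.0                        # value as a U.S. lab might report it
eu_sheet_kg = lb_to_kg(us_sheet_lb)       # same result as a metric lab might report it

print(round(eu_sheet_kg, 2))              # 22.68
print(round(kg_to_lb(eu_sheet_kg), 2))    # 50.0
```

Unit conversion only removes the cosmetic difference. It does not make results from different test methods comparable.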
There are two common corrosion tests used within the lubricant industry: ASTM D665 and ASTM D130. Both have nuances, depending upon the application. D665 has an A and a B version of the test method. The A version uses distilled water, while the B version uses synthetic seawater. The saltwater version is generally the more severe, so when comparing results one should ensure that the same test conditions were employed. The D130 test can be run at two different test temperatures. This can make a big difference depending upon the EP package used in the gear oil's formulation. The same result published on two competitive data sheets might not mean the same thing if the test temperature is not published.
The gear lubricant features that should be evaluated when comparing data sheets depend upon the application. See Table 4 for a list of common lubricant features cross-referenced with application conditions and the test results that indicate a lubricant's suitability for that application.
While the lubricant industry is considered relatively mature, there are still areas of active research. The leading edge for lubricant manufacturers is formulating products for challenging wind turbine gearbox applications. Over the years, wind turbine OEMs have found that their gear sets are notorious for micropitting, also sometimes called fatigue scoring, flecking, frosting, glazing, gray staining, microspalling, peeling or superficial spalling. Errichello describes it like this:
"Micropitting is surface fatigue occurring in Hertzian contacts, caused by cyclic contact stresses and plastic flow on the asperity scale that results in micro-cracking, formation of micropits and loss of material."
The FVA 54 test evaluates this phenomenon. This test, whose results depend on base fluid, viscosity and additive chemistry, is not easy to pass. This is why some wind turbine OEMs have come to respect the data from this test and have incorporated it into their specifications. Several OEMs also require a passing result for their general industrial gear specifications.