
David McGinnis
Mar 17, 2020 · 6 min read
Testing a Hive Patch on a Local System
[...] I needed to get a Hive cluster running my code and a Confluent cluster that could output Avro messages in the proper format to test.


David McGinnis
Oct 22, 2019 · 4 min read
Running Garbage Collection on Your Cluster
At a high level, [CGC] is merely going through the cluster, taking inventory of the data and processes that run on the cluster...


David McGinnis
Oct 16, 2019 · 6 min read
Writing Environment Agnostic Code
[...] we'll discuss some of the ways we can write environment agnostic code, which can be run on any environment within your enterprise.


David McGinnis
Sep 29, 2019 · 4 min read
YARN Capacity Scheduler and Node Labels Part 2
How do we ensure that GPU jobs run on worker nodes with GPUs without buying expensive GPUs for all of our worker nodes?


David McGinnis
Sep 22, 2019 · 5 min read
YARN Capacity Scheduler and Node Labels Part 1
I'm going to explore exactly how YARN works with queues, and the various mechanisms available to control how YARN does this.


David McGinnis
Sep 1, 2019 · 10 min read
Machine Learning Solutions: Recommender System Design
With the help of tools like Spark’s MLlib ... [making a recommendation engine] is something that many companies have done and you can too.