Things I Learned Building a Gem

Tech: Ruby, RuboCop, Splunk, Docker, CI, RSpec, TigerConnect

Challenge: a particular component (HIPAA-compliant text messaging) of an application proved valuable in other applications. The code was organically customized and integrated into several applications. Eventually, we had a clear case to pull out the duplicated code (DRY!) from the various repos, standardize, and package as a gem.

What I learned:

  1. send is not a reserved keyword, but it is a built-in Ruby method (Object#send, for dynamically calling methods by name), so naming your own method send overrides it
  2. How to pass creds for a private GitLab/GitHub repo into Gemfile, so the gem can be implemented (gitlab-ci-token)
  3. After updating the gem, and using bundle update in an app implementing the gem, sometimes you must force pull from the private repo to get the latest changes.
  4. GitLab CI w/ RSpec, Rubocop!
  5. garbage collection https://blog.codeship.com/visualizing-garbage-collection-ruby-python/
  6. Splunk search terms are NOT case sensitive (field names are)!
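To illustrate item 1, here is a minimal sketch of dynamic dispatch. The Messenger class and its method names are made up for this example; only send and public_send are real Ruby methods.

```ruby
# Hypothetical class, for illustration only.
class Messenger
  def deliver_sms(body)
    "sms: #{body}"
  end

  def deliver_email(body)
    "email: #{body}"
  end

  private

  def audit(body)
    "audit: #{body}"
  end
end

messenger = Messenger.new
channel = 'sms'

# public_send looks the method up by name at runtime and respects visibility.
result = messenger.public_send("deliver_#{channel}", 'hello')

# send also reaches private methods, which is why a gem that defines its own
# `send` method can surprise callers who expect the built-in behavior.
private_result = messenger.send(:audit, 'hello')
```

If a class really does need its own send method, Ruby keeps the original available as __send__.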

 

And topics/pitfalls for further research:

  1. Oddly, my Mail settings took in the local environment, but not in the application Docker image utilizing the gem
  2. In the local environment, one dynamic require all in the main module worked nicely for files in ./lib. However, needed multiple require statements (one for each file) when implementing the gem elsewhere.
  3. Ignoring Gemfile.lock in commits was helpful when trying to implement gem in applications.
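On pitfall 2, one possible explanation (an assumption, I have not confirmed it against the original gem) is that a path like ./lib resolves against the working directory of the running process, which is the host application's directory once the gem is installed elsewhere. A sketch that anchors the glob to an absolute path instead; require_all is a hypothetical helper name:

```ruby
# Hypothetical helper: require every .rb file under base_dir, using an
# absolute path so the result does not depend on the current directory.
def require_all(base_dir)
  pattern = File.join(File.expand_path(base_dir), '**', '*.rb')
  # Sort for a deterministic load order across platforms.
  Dir[pattern].sort.each { |file| require file }
end

# Inside the gem's main file this would be called with a path built from
# __dir__ (the directory of the current file), for example:
#   require_all(File.join(__dir__, 'my_gem'))   # 'my_gem' is a placeholder
```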

 

Reference:

Generic Gem Template

Building a Gem Guide

Naming, Versioning, Dependencies, etc.

Bundler – creating gem

GitLab Token

Bug Troubleshooting w/ New Tools

Tech: Ruby, TigerConnect, Splunk, SSH, CSSHX, grep, curl, HTTParty

Challenge: an application is configured to log an event in Splunk after a successful send of a TigerConnect HIPAA-compliant alert message. The entry point is working; however, the alerts are not being sent AND the event is not being logged. After ruling out the obvious suspects (determining what’s NOT wrong), I turned to CSSHX and grep to scour the logs. Why CSSHX… our production instances run on 4 servers concurrently. I wanted to quickly navigate in 1 terminal tab w/ 4 windows and grep the logs!

Code:

CLI (for 4 instances) => csshx username@dserver.location.extension username@dserver.location.extension username@dserver.location.extension username@dserver.location.extension

 

grep for various strings => grep -A 10 "error" log_file.log

-A # is for number of lines after the string

-B # is for number of lines before

 

I also wanted to check my HTTParty gem code that transmits the event from app to Splunk. I used a curl statement to mimic the HTTParty call.

CLI => curl https://url.extension:####/services/collector -k -H 'content-type: application/json' -H 'authorization: XXXXXX' -d '{"event":{"app": "data"}, "sourcetype": "_json"}'

-k (--insecure) allows for insecure server connections

-H header key, values
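For comparison, here is a rough Ruby equivalent of that curl call, written with the standard library's Net::HTTP rather than HTTParty so it has no gem dependencies. The helper names are hypothetical, and the URL and authorization value are whatever you pass in; nothing here is taken from the real app.

```ruby
require 'net/http'
require 'openssl'
require 'json'
require 'uri'

# Build (but do not send) a POST that mirrors the curl call:
# same headers, same JSON body shape.
def build_event_request(url, auth_value, event)
  uri = URI.parse(url)
  request = Net::HTTP::Post.new(uri.request_uri)
  request['Content-Type'] = 'application/json'
  request['Authorization'] = auth_value
  request.body = JSON.generate({ event: event, sourcetype: '_json' })
  [uri, request]
end

# Send the request. VERIFY_NONE mirrors curl's -k and skips certificate
# checks, so it is only appropriate while troubleshooting.
def send_event(uri, request)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE
  http.request(request)
end
```

Splitting the build step from the send step makes it easy to inspect the headers and body without touching the network.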

 

I was quite proud to implement the use of these tools when troubleshooting. I have used them in different contexts and it was cool to bring everything together to discover the issue. However (and sadly), the issue was much simpler.

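Ruby ENV variables are always strings, even when the value looks like a boolean.

in .env … VAR=false

if ENV['VAR'] == false then <do something> end

=> the comparison is always FALSE, because ENV['VAR'] is the string "false" and never equals the boolean false. And because any string is truthy in Ruby, a bare if ENV['VAR'] would be TRUE. Compare against the string instead: ENV['VAR'] == 'false'.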
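A small sketch of the gotcha and one way around it. env_flag? and TRUTHY_VALUES are hypothetical names, and the list of accepted values is my own choice, not a Ruby convention.

```ruby
# Values treated as "true"; everything else (including a missing variable)
# counts as false. This list is an assumption, adjust to taste.
TRUTHY_VALUES = %w[1 true yes on].freeze

# Interpret an environment variable as a boolean.
def env_flag?(name, env = ENV)
  TRUTHY_VALUES.include?(env.fetch(name, '').strip.downcase)
end

env = { 'VAR' => 'false' } # stand-in for ENV after loading .env

equals_boolean = (env['VAR'] == false)     # false: a String never equals false
is_truthy      = env['VAR'] ? true : false # true: any string is truthy
flag           = env_flag?('VAR', env)     # false: parsed as intended
```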

 

Pattern: Iteration within Iteration (pt. 2)

Tech: Ruby

Challenge: returning to iterating! Last time, I wanted the sub-iteration to skip records already looked at, as well as skip the index forward. Let’s gooo!

Code:

i = 0

patients.each_with_index do |row, index|
  # skip rows already processed in the while loop
  next if i > index

  # using while equal makes it a little easier to understand than part 1’s solution
  # i will increase, and as long as it keeps matching our entry index :id, we are within the same patient’s records
  while patients[i][:id] == patients[index][:id]
    # do logic

    i += 1
    break unless patients[i] # no more records
  end
end
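For comparison, the same "walk each patient's consecutive rows once" idea can be sketched with Enumerable#chunk_while, which removes the manual index bookkeeping. This is an alternative, not the code that went to production; the sample rows and the :location field are made up, and like the loop above it assumes rows for one patient sit next to each other.

```ruby
# Made-up sample data; rows for the same patient are adjacent.
patients = [
  { id: 1, location: 'ER' },
  { id: 1, location: 'ICU' },
  { id: 2, location: 'ER' },
  { id: 3, location: 'Ward' },
  { id: 3, location: 'ICU' }
]

# chunk_while groups consecutive rows while the block returns true,
# so each group holds exactly one patient's records.
last_locations = patients
  .chunk_while { |a, b| a[:id] == b[:id] }
  .to_h { |rows| [rows.first[:id], rows.last[:location]] }
```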

 

Takeaway: I previously implemented both part 1’s solution and this one in production code. I wonder if there is a scenario where part 1 would be the better choice (I think not, because part 1 processes rows redundantly).


Undesired rc-slider onAfterChange Event

Tech: JavaScript, npm rc-slider, React

Challenge: The webapp designer chose to have both the rc-slider’s initial state (null) and the lowest actual value (1) be represented in the lowest/left-most node. If the User simply clicks the left-most node, the null value updates to 1 through the RcSlider onAfterChange API. The problem is whenever the next click occurs… anywhere on the page OR on any element… onAfterChange fires again, which is undesired!

Code:

The rc-slider is wrapped with a custom element. We want the rc-slider onAfterChange event to bubble up, so we pass down our “onClick” function. (Note: these elements & functions have been stripped for ease of explanation.)

onClick function:

onSliderClick = (factorIndex, value) => {
  if (value === 0) {
    this.props.sliderClickedFromDefault(factorIndex, 0)
  }
}

React elements:

parent: <Slider onSliderClick={() => onSliderClick(factorIndex, value)} />

child: <RcSlider onAfterChange={onSliderClick} />

 

The initial click of the rc-slider works great! During troubleshooting, we capture the document.activeElement:

[Image: document_active_element_1]

 

It is the second click, anywhere on the page OR on any element, that causes the issue. Here is the Redux Inspector Action Event Log:

[Image: react_log]

 

After reading through known rc-slider issues, and speaking with the Great Google, I have not yet found a solid solution or explanation as to why this is happening. My workaround…

Let’s check the activeElement the second time onAfterChange fires. If it is not my desired element (the slider!), do not dispatch an action!

onSliderClick = (factorIndex, value) => {
  const activeClass = document.activeElement.className

  if (value === 0 && activeClass === 'rc-slider-handle rc-slider-handle-click-focused') {
    this.props.sliderClickedFromDefault(factorIndex, 0)
  }
}

 

Takeaway: I do not love this solution and am convinced something else is going on here. Although the workaround is effective, I am writing this post in hopes of coming back to it with a real solution & understanding.

 

References:

https://www.npmjs.com/package/rc-slider

JavaScript: Evolved Thinking

Tech: JavaScript in React application

Challenge: refactor an archaic iteration to identify the correct string within an array of objects. A code review by a senior programmer identified a section of code ripe for refactor. JavaScript has some great built-in object methods; let’s fully utilize!

Code:

Here is our array with objects identifying URL paths and additional properties

[Image: blog_evolved_thinking_pages]

 

And the original iteration to identify the api/database’s value for lastScreenSaved, ex – ‘factor_affect’.

[Image: blog_evolved_thinking_original_iteration]

What’s wrong?? For starters, the iteration continues after identifying the correct index. Second, there are better tools to identify a substring (which we will see in the refactored code).

 

Let’s take a look at the refactored code:

[Image: blog_evolved_thinking_refactored]

Wow! Much more compact and clean. Allows for quicker understanding, easier testing, and faster refactoring if URL extension changes in future.

 

Takeaway: I tend to think in basic data structures and iterations from my early coding days. It pays to take a minute prior to merge requests to look for & refactor code to better utilize the tools available in mature languages. 

 

Reference:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/findIndex

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/search

 

Recovering Data with ActiveRecord, AXLSX Gem

Tech: Rails, SQLite, ActiveRecord, Ruby gems activerecord-import, axlsx, and axlsx_rails

Challenge: a co-worker is facing a data corruption issue. Don’t ask me all the details. The current situation… we have two spreadsheets (one with 985K records, the other with 1.4 million records) and need to identify rows/records that are not present in both spreadsheets.

We initially tackled in Microsoft Access but performance issues became a real nuisance. Access would freeze on file upload, when scrolling through query results, etc.

Additionally, we were seeing funny results. We want to LEFT JOIN on 4 columns, and include those 4 columns in the WHERE clause (column IS NULL). Joining and where-ing on 1 column, then 2 and 3 columns brought back the expected record count. However, joining on all 4 returned curious results — just the records in “table A” of the left join, which didn’t make sense.

Solution: Since working with a DB administrator wasn’t a viable option, I thought hard about what tools I could use to improve performance while investigating the issue. I have used the above-mentioned tech for database & Excel work, and although wary of the size of the Excel sheets & corresponding query results, I decided to give it a try. Couldn’t be worse than Access, and it would also help verify our query results.

Code: 

First, load the Excel records into SQLite w/ ActiveRecord. Fairly easy with activerecord-import. The only issue was the large number of records… but that was easily solved by batching the uploads. In /lib/seeds/, create a seeds.rb file and add:

[Image: sean_seeds_import]
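The batching idea itself is simple enough to sketch in plain Ruby. This is not the seeds file from the screenshot; import_in_batches is a hypothetical helper, the batch size is arbitrary, and Record in the comment stands for whatever model you are importing into.

```ruby
# Hand rows to an importer block in fixed-size batches and return the
# number of rows processed.
def import_in_batches(rows, batch_size: 10_000)
  imported = 0
  rows.each_slice(batch_size) do |batch|
    # In the Rails app, the block would call activerecord-import, e.g.
    #   Record.import(columns, batch, validate: false)
    yield batch
    imported += batch.size
  end
  imported
end
```

Usage: import_in_batches(rows, batch_size: 5_000) { |batch| Record.import(columns, batch) }, where rows is an array of value arrays read from the spreadsheet.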

 

Second, set up the export route in the controller with the corresponding SQL:

[Image: sean_export_route]

 

And the axlsx export file in views:

[Image: sean_axlsx_export]

 

Conclusion: the results look promising but more research is needed. If the 4th column (COMPL_DTE) is included in the ON clause, we just see all the records of Table A and not what we want… the records from Table A that have no corresponding record in Table B. Additionally, the export out to Excel took a long time! The export of orphaned Table B records, approximately 785K, took 3-4 hours.
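One thing worth checking (a guess on my part, not verified against this data): in SQL, NULL = NULL is not true, so any row with a NULL in a join column can never match in the ON clause and will always show up as "unmatched". If COMPL_DTE is NULL in most rows, that alone would return nearly all of Table A. The sketch below mimics both behaviors in plain Ruby; the helper names are hypothetical.

```ruby
require 'set'

# Anti-join where nil keys are allowed to match each other
# (what we intuitively expect).
def orphans(table_a, table_b, keys)
  seen = table_b.map { |row| row.values_at(*keys) }.to_set
  table_a.reject { |row| seen.include?(row.values_at(*keys)) }
end

# Anti-join with SQL-style semantics: a key containing nil (NULL) never
# matches anything, so those rows are always reported as orphans.
def sql_style_orphans(table_a, table_b, keys)
  seen = table_b.map { |row| row.values_at(*keys) }
                .reject { |key| key.include?(nil) }
                .to_set
  table_a.reject { |row| seen.include?(row.values_at(*keys)) }
end
```

If this turns out to be the cause, SQLite's IS operator compares NULLs as equal (a.COMPL_DTE IS b.COMPL_DTE), which would be one way to test the theory.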

 

Reference:

https://support.office.com/en-us/article/compare-two-tables-and-find-records-without-matches-cb20ad48-4eba-402a-b20d-eaf10a5d1cb4

https://stackoverflow.com/questions/6613708/how-can-i-join-two-tables-but-only-return-rows-that-dont-match

https://mattboldt.com/importing-massive-data-into-rails/

 


The Power of… Backups!

Tech: Heroku, Ruby, ActiveRecord

Challenge: database restoration! During my almost-daily updating of www.embiidfeed.com during the NBA season, I accidentally updated ALL the records instead of a specific record. Whoops. I was instantly crushed, angry, and rueful. But much like this blog, present-Luke is very thankful for past-Luke…. say hello to Heroku backups! Luckily, I had taken advantage of Heroku’s daily backup feature and just a few commands would restore the DB to its previous glory. (BTW, I realize consoling into the production database for updates is a bad idea. I have a backlog item to better automate this with a script & endpoint, but personal projects typically take a slower path.)

First, the infraction:

[Image: blog_db_backup_1]

 

Let’s take a look at what’s available:

[Image: blog_db_backup_2]

 

And restore!

[Image: blog_db_backup_3]

 

Takeaway: although not a DB administrator, I will always take a moment to ensure a backup database plan is discussed & implemented for any project! Too dangerous otherwise. And always… Trust the Process.

Reference:

https://devcenter.heroku.com/articles/heroku-postgres-backups

 

 


Pattern: Iteration within Iteration (pt. 1)

Tech: Ruby

Challenge: need to iterate over a set of patient records…within the set, each patient has several records… so when the initial iteration gets to a new patient, the code needs to include a sub-iteration specific to the patient. For example, a patient’s subset of data includes their location & treatment history. We want to capture the patient’s last location, but their first instance of treatment. The data is extracted into a patient-specific data hash.

Code:

patients.each_with_index do |row, index|
  i = index

  until patients[i]['ID'] != patients[index]['ID']
    update_patient_hash(row['ID'], data)

    i += 1
    break unless patients[i] # no more records
  end
end

 

Takeaway: this could benefit from refactoring…

  1. The sub-iteration repeats the processing of records because it reads data from row (the outer iteration’s record) instead of patients[i] (the sub-iteration’s record)
  2. Could the initial iteration’s index be reset/moved to the new patient once the sub-iteration is finished?
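As a possible direction for that refactor, here is a single-pass sketch that visits each row once and builds the patient-specific hash as it goes. The 'ID' key comes from the code above; the 'location' and 'treatment' fields and the sample rows are assumptions for illustration.

```ruby
# Made-up sample data in chronological order per patient.
records = [
  { 'ID' => 'A1', 'location' => 'ER',   'treatment' => nil },
  { 'ID' => 'A1', 'location' => 'ICU',  'treatment' => 'IV fluids' },
  { 'ID' => 'A1', 'location' => 'Ward', 'treatment' => 'Antibiotics' },
  { 'ID' => 'B2', 'location' => 'ER',   'treatment' => 'Splint' }
]

summary = records.each_with_object({}) do |row, acc|
  entry = acc[row['ID']] ||= { last_location: nil, first_treatment: nil }
  # Overwritten on every row, so the last row for the patient wins.
  entry[:last_location] = row['location']
  # Assigned only while still nil, so the first recorded treatment wins.
  entry[:first_treatment] ||= row['treatment']
end
```

Because the hash is keyed by patient ID, this version does not even require a patient's rows to be adjacent.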

Ruby Template => Data to JSON

Tech: Ruby, JSON, Docker

Challenge: data for daily reporting is captured in a class instance variable hash. If/when the docker container restarts, I want the data, which is regularly written to a JSON file, to be reloaded into the @hash.

Code:

def initialize
  if File.exist?('./data.json')
    @instance_var_hash = JSON.parse(File.read('./data.json'))
  else
    @instance_var_hash = { ... }
  end
end

Potential Pitfall: when sandboxing, I had an instance variable hash with symbol keys. When reading in the JSON file and writing to my instance variable, I generated an error in my methods… :key in original instance variable versus 'key' from incoming JSON.
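The key mismatch is easy to reproduce, and JSON.parse has a symbolize_names option that avoids it. A minimal sketch with a made-up one-key hash:

```ruby
require 'json'

original = { key: 'value' }

# Symbol keys are written out as plain JSON strings.
json = JSON.generate(original)

# By default the keys come back as strings, so hash[:key] returns nil.
string_keys = JSON.parse(json)

# symbolize_names: true converts the keys back to symbols on the way in.
symbol_keys = JSON.parse(json, symbolize_names: true)
```

Pick one convention (all strings or all symbols) and apply it in both the default hash and the parse call.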
 
Does the JSON file exist?

if File.exist?('./file_name.json')

Read file

JSON.parse(File.read('./file_name.json'))

Open the file & write

File.open('./temp.json', 'w') do |f|
  f.write(JSON.pretty_generate(object))
end
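Putting the three pieces together, a reload-on-restart store could look roughly like this. ReportStore is a hypothetical class name, and the path and default hash are whatever the caller passes in.

```ruby
require 'json'

# Keeps a hash in memory and persists it to a JSON file so the data
# survives a container restart.
class ReportStore
  attr_reader :data

  def initialize(path, default = {})
    @path = path
    # Reload previous data if the file is there, otherwise start fresh.
    @data = File.exist?(path) ? JSON.parse(File.read(path)) : default
  end

  # Write the current hash back to disk in readable form.
  def save
    File.write(@path, JSON.pretty_generate(@data))
  end
end
```

In Docker, the file needs to live on a mounted volume; anything written inside the container's own filesystem is lost when the container is recreated.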

 
Reference:

https://github.com/lukekedz/ruby_json_template

http://ruby-doc.org/stdlib-1.9.3/libdoc/json/rdoc/JSON.html

http://ruby-doc.org/core-2.2.0/FileTest.html

https://stackoverflow.com/questions/5507512/how-to-write-to-a-json-file-in-the-correct-format/5507535

https://hackhands.com/ruby-read-json-file-hash/