October 8, 2021 | Developers & Practitioners
The Workflows team recently announced the general availability of iteration syntax and connectors!
Iteration syntax supports easier creation and better readability of workflows that process many items. You can use a for loop to iterate through a collection of data in a list or map, and keep track of the current index. If you have a specific range of numeric values to iterate through, you can also use range-based iteration.
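As a rough illustration of the two forms, here is a minimal standalone workflow that first loops over a numeric range and then over the keys of a map. The step names, the total variable, and the scores map are hypothetical and exist only for this sketch; range bounds are inclusive.

```yaml
main:
  steps:
    - init:
        assign:
          - total: 0
          # Hypothetical sample data for the map iteration below.
          - scores: {"alice": 3, "bob": 5}
    - sumRange:
        for:
          value: n
          range: [1, 5]  # iterates n = 1, 2, 3, 4, 5
          steps:
            - addNumber:
                assign:
                  - total: ${total + n}
    - sumMapValues:
        for:
          value: name
          in: ${keys(scores)}  # iterate over the map's keys
          steps:
            - addScore:
                assign:
                  - total: ${total + scores[name]}
    - returnTotal:
        return: ${total}
```

Variables assigned before the loop (total here) remain visible and writable inside it, which is the same pattern the sentiment workflow relies on later in this post.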
Connectors have been in preview since January. Think of connectors like client libraries for workflows to use other services. They handle authentication, request formats, retries, and waiting for long-running operations to complete. Check out our previous blog post for more details on connectors. Since January, the number of available connectors has increased from 5 to 20.
The combination of iteration syntax and connectors enables you to implement robust batch processing use cases. Let’s take a look at a concrete sample. In this example, you will create a workflow that analyzes the sentiment of the latest tweets from a Twitter handle, using the Cloud Natural Language API connector and iteration syntax.
APIs for Twitter sentiment analysis
The workflow will use the Twitter API and Natural Language API. Let’s take a closer look at them.
Twitter API
To use the Twitter API, you’ll need a developer account. Once you have the account, you need to create an app and get a bearer token to use in your API calls. Twitter has an API to search for Tweets.
Here’s an example that fetches up to 100 recent Tweets from the @GoogleCloudTech handle using the Twitter search API:
BEARER_TOKEN=...
TWITTER_HANDLE=GoogleCloudTech
MAX_RESULTS=100
curl -X GET -H "Authorization: Bearer $BEARER_TOKEN" "https://api.twitter.com/2/tweets/search/recent?query=from:$TWITTER_HANDLE&max_results=$MAX_RESULTS"
Natural Language API
The Natural Language API uses machine learning to reveal the structure and meaning of text. It has methods such as sentiment analysis, entity analysis, syntactic analysis, and more. In this example, you will use sentiment analysis. Sentiment analysis inspects the given text and identifies the prevailing emotional attitude within the text, especially to characterize a writer’s attitude as positive, negative, or neutral.
You can see a sample sentiment analysis response here. You will use the score of documentSentiment to identify the sentiment of each post. Scores range between -1.0 (negative) and 1.0 (positive) and correspond to the overall emotional leaning of the text. You will also calculate the average and minimum sentiment score of all processed tweets.
Define the workflow
Let’s start building the workflow in a workflow.yaml file.
In the init step, read the bearer token, Twitter handle, and max results for the Twitter API as runtime arguments. Also initialize some sentiment analysis related variables:
main:
  params: [args]
  steps:
    - init:
        assign:
          - bearerToken: ${args.bearerToken}
          - twitterHandle: ${args.twitterHandle}
          - maxResults: ${args.maxResults}
          - totalSentimentScore: 0
          - minSentimentScore: 1
          - minSentimentIndex: -1
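If you would rather not require every argument on each execution, one option is to fall back to defaults. The sketch below is an assumption layered on top of the original init step, not part of it: it uses the standard library functions map.get and default, and the fallback values ("GoogleCloudTech" and "100") are only illustrative.

```yaml
    - init:
        assign:
          # The bearer token has no sensible default, so it stays required.
          - bearerToken: ${args.bearerToken}
          # map.get returns null for a missing key; default() then supplies the fallback.
          - twitterHandle: ${default(map.get(args, "twitterHandle"), "GoogleCloudTech")}
          - maxResults: ${default(map.get(args, "maxResults"), "100")}
          - totalSentimentScore: 0
          - minSentimentScore: 1
          - minSentimentIndex: -1
```

The default for maxResults is kept as a string because the searchTweets step concatenates it into the request URL.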
In the searchTweets step, fetch tweets using the Twitter API:
    - searchTweets:
        call: http.get
        args:
          url: ${"https://api.twitter.com/2/tweets/search/recent?query=from:" + twitterHandle + "&max_results=" + maxResults}
          headers:
            Authorization: ${"Bearer " + bearerToken}
        result: searchTweetsResult
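Note that the URL above is built by string concatenation, so maxResults has to arrive as a string. As an alternative sketch, the http.get call also accepts a query map, which keeps the base URL and the parameters separate. This variant is not from the original sample; treat it as one possible way to write the same request.

```yaml
    - searchTweets:
        call: http.get
        args:
          url: https://api.twitter.com/2/tweets/search/recent
          # Query parameters are passed as a map instead of being concatenated into the URL.
          query:
            query: ${"from:" + twitterHandle}
            max_results: ${maxResults}
          headers:
            Authorization: ${"Bearer " + bearerToken}
        result: searchTweetsResult
```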
In the processPosts step, analyze each tweet and keep track of the sentiment scores. Notice how each tweet is analyzed using the new for-in iteration syntax with its access to the current index.
    - processPosts:
        for:
          value: tweet
          index: tweetIndex
          in: ${searchTweetsResult.body.data}
The processPosts step contains three substeps. The analyzeSentiment step uses the Natural Language API connector to analyze the text of a tweet. The updateTotalSentimentScore step then adds the score to the running total, and the updateMinSentiment step keeps track of the minimum sentiment score and its index:
          steps:
            - analyzeSentiment:
                call: googleapis.language.v1.documents.analyzeSentiment
                args:
                  body:
                    document:
                      content: ${tweet.text}
                      type: "PLAIN_TEXT"
                result: sentimentResult
            - updateTotalSentimentScore:
                assign:
                  - currentScore: ${sentimentResult.documentSentiment.score}
                  - totalSentimentScore: ${totalSentimentScore + currentScore}
            - updateMinSentiment:
                switch:
                  - condition: ${currentScore < minSentimentScore}
                    steps:
                      - assignMinSentiment:
                          assign:
                            - minSentimentScore: ${currentScore}
                            - minSentimentIndex: ${tweetIndex}
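To get a feel for what the connector is doing for you, here is a rough sketch of the same sentiment call written as a plain HTTP step. It is for comparison only and is not part of the sample: the step name analyzeSentimentViaHttp is hypothetical, and the sketch assumes the workflow's service account is allowed to call the Natural Language API.

```yaml
            - analyzeSentimentViaHttp:
                call: http.post
                args:
                  url: https://language.googleapis.com/v1/documents:analyzeSentiment
                  # Authentication must be requested explicitly for a raw HTTP call.
                  auth:
                    type: OAuth2
                  body:
                    document:
                      content: ${tweet.text}
                      type: "PLAIN_TEXT"
                result: sentimentResult
            - readScore:
                assign:
                  # A raw HTTP result wraps the payload in body, unlike the connector result.
                  - currentScore: ${sentimentResult.body.documentSentiment.score}
```

With the connector, authentication and the request format are handled for you, and the score is available directly at sentimentResult.documentSentiment.score.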
Once outside the processPosts step, calculate the average sentiment score, and then log and return the results:
    - assignResult:
        assign:
          - numberOfTweets: ${len(searchTweetsResult.body.data)}
          - averageSentiment: ${totalSentimentScore / numberOfTweets}
    - logResult:
        call: sys.log
        args:
          text: ${"N:" + string(numberOfTweets) + " tweets with average sentiment:" + string(averageSentiment) + " min sentiment:" + string(minSentimentScore) + " at index:" + string(minSentimentIndex)}
    - returnResult:
        return:
          numberOfTweets: ${numberOfTweets}
          totalSentimentScore: ${totalSentimentScore}
          averageSentiment: ${averageSentiment}
          minSentimentScore: ${minSentimentScore}
          minSentimentIndex: ${minSentimentIndex}
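One edge case worth considering: if the search returns no tweets, the loop has nothing to iterate over and the average would divide by zero. The sketch below shows one possible guard, placed between the searchTweets and processPosts steps. It assumes the Twitter response omits the data field when there are no matches, and the step names checkForTweets and returnNoTweets are hypothetical.

```yaml
    - checkForTweets:
        switch:
          # Assumption: an empty search result has no "data" key in the response body.
          - condition: ${not("data" in searchTweetsResult.body)}
            steps:
              - returnNoTweets:
                  return:
                    numberOfTweets: 0
```

When the condition is false, execution simply continues with the next step, so the rest of the workflow stays unchanged.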
Deploy and execute the workflow
To try out the workflow, let’s deploy and execute it.
Deploy the workflow:
gcloud workflows deploy twitter-sentiment \
    --source=workflow.yaml
Execute the workflow (don’t forget to pass in your own bearer token):
gcloud workflows execute twitter-sentiment \
    --data='{"bearerToken":"<your_token_here>", "twitterHandle":"GoogleCloudTech","maxResults":"100"}'
After a minute or so, you should see the result with sentiment scores:
gcloud workflows executions describe bcf52313-4ce9-4c4f-9b5e-2f461223923f --workflow twitter-sentiment
...
result: '{"averageSentiment":0.27076923,"minSentimentIndex":57,"minSentimentScore":-0.2,"numberOfTweets":65,"totalSentimentScore":17.5999}'
state: SUCCEEDED
Next
Thanks to the iteration syntax and connectors, we were able to read and analyze Tweets in an intuitive and robust workflow with no code. Please reach out to @meteatamel and krisabraun@ for questions and feedback.