PythonClub – A Brazilian collaborative blog about Python: Sorting Algorithms

Hey everyone, how's it going?

In the videos below, we'll learn how to implement some classic sorting algorithms in Python.

Bubble Sort

How the algorithm works: https://www.youtube.com/watch?v=Doy64STkwlI.

How to implement the algorithm in Python: https://www.youtube.com/watch?v=B0DFF0fE4rk.

Algorithm code

def sort(array):
    for final in range(len(array), 0, -1):
        exchanging = False

        for current in range(0, final - 1):
            if array[current] > array[current + 1]:
                array[current + 1], array[current] = array[current], array[current + 1]
                exchanging = True

        if not exchanging:
            break
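For instance, a quick usage sketch (it applies equally to the Selection Sort and Insertion Sort functions below): the function sorts the list in place and returns None.

numbers = [5, 1, 4, 2, 8]
sort(numbers)   # sorts in place; returns None
print(numbers)  # [1, 2, 4, 5, 8]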

Selection Sort

How the algorithm works: https://www.youtube.com/watch?v=PLvo_Yb_myrNBhIdq8qqtNSDFtnBfsKL2r.

How to implement the algorithm in Python: https://www.youtube.com/watch?v=0ORfCwwhF_I.

Algorithm code

def sort(array):
    for index in range(0, len(array)):
        min_index = index

        for right in range(index + 1, len(array)):
            if array[right] < array[min_index]:
                min_index = right

        array[index], array[min_index] = array[min_index], array[index]

Insertion Sort

How the algorithm works: https://www.youtube.com/watch?v=O_E-Lj5HuRU.

How to implement the algorithm in Python: https://www.youtube.com/watch?v=Sy_Z1pqMgko.

Algorithm code

def sort(array):
    for p in range(0, len(array)):
        current_element = array[p]

        while p > 0 and array[p - 1] > current_element:
            array[p] = array[p - 1]
            p -= 1

        array[p] = current_element
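As a quick sanity check for any of the three implementations above, here is a minimal sketch that compares the in-place sort against Python's built-in sorted on random inputs (it assumes the chosen sort function is already defined in scope):

import random

for _ in range(100):
    data = [random.randint(0, 99) for _ in range(random.randint(0, 20))]
    expected = sorted(data)
    sort(data)
    assert data == expected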

Planet Python

Vladimir Iakolev: Analysing the trip to South America with a bit of image recognition

Back in September, I took a three-week trip to South America. While planning the trip I used a sort of data mining to select the most optimal flights, and it worked well. To continue following the data-driven approach (more buzzwords), I decided to analyze the data I collected during the trip.

Unfortunately, as I was traveling without a local SIM card and almost without internet access, I couldn't use Google Location History as in the fun research about the commute. But at least I have tweets and a lot of photos.

At first, I reused old code (more internal linking) and extracted information about flights from tweets:

all_tweets = pd.DataFrame(
    [(tweet.text, tweet.created_at) for tweet in get_tweets()],  # get_tweets available in the gist
    columns=['text', 'created_at'])

tweets_in_dates = all_tweets[
    (all_tweets.created_at > datetime(2018, 9, 8))
    & (all_tweets.created_at < datetime(2018, 9, 30))]

flights_tweets = tweets_in_dates[tweets_in_dates.text.str.upper() == tweets_in_dates.text]

flights = flights_tweets.assign(start=lambda df: df.text.str.split('✈').str[0],
                                finish=lambda df: df.text.str.split('✈').str[-1]) \
                        .sort_values('created_at')[['start', 'finish', 'created_at']]
>>> flights
   start finish          created_at
19   AMS    LIS 2018-09-08 05:00:32
18   LIS    GIG 2018-09-08 11:34:14
17   SDU    EZE 2018-09-12 23:29:52
16   EZE    SCL 2018-09-16 17:30:01
15   SCL    LIM 2018-09-19 16:54:13
14   LIM    MEX 2018-09-22 20:43:42
13   MEX    CUN 2018-09-25 19:29:04
11   CUN    MAN 2018-09-29 20:16:11

Then I found a JSON dump with airports, made a little hack replacing Ezeiza with Buenos-Aires, and derived the cities and lengths of stay from the flights:

flights = flights.assign(
    start=flights.start.apply(lambda code: iata_to_city[re.sub(r'\W+', '', code)]),  # Removes leftovers of emojis, iata_to_city available in the gist
    finish=flights.finish.apply(lambda code: iata_to_city[re.sub(r'\W+', '', code)]))

cities = flights.assign(
    spent=flights.created_at - flights.created_at.shift(1),
    city=flights.start,
    arrived=flights.created_at.shift(1),
)[["city", "spent", "arrived"]]
cities = cities.assign(left=cities.arrived + cities.spent)[cities.spent.dt.days > 0]
>>> cities
              city           spent             arrived                left
17  Rio De Janeiro 4 days 11:55:38 2018-09-08 11:34:14 2018-09-12 23:29:52
16    Buenos-Aires 3 days 18:00:09 2018-09-12 23:29:52 2018-09-16 17:30:01
15        Santiago 2 days 23:24:12 2018-09-16 17:30:01 2018-09-19 16:54:13
14            Lima 3 days 03:49:29 2018-09-19 16:54:13 2018-09-22 20:43:42
13     Mexico City 2 days 22:45:22 2018-09-22 20:43:42 2018-09-25 19:29:04
11          Cancun 4 days 00:47:07 2018-09-25 19:29:04 2018-09-29 20:16:11

>>> cities.plot(x="city", y="spent", kind="bar",
...             legend=False, title='Cities') \
...       .yaxis.set_major_formatter(formatter)  # Ugly hack for timedelta formatting, more in the gist

Cities

Now it's time to work with photos. I downloaded all of them from Google Photos, parsed creation dates from the Exif data, and "joined" them with the cities by creation date:

raw_photos = pd.DataFrame(list(read_photos()), columns=['name', 'created_at'])  # read_photos available in the gist

photos_cities = raw_photos.assign(key=0).merge(cities.assign(key=0), how='outer')
photos = photos_cities[
    (photos_cities.created_at >= photos_cities.arrived)
    & (photos_cities.created_at <= photos_cities.left)
]
>>> photos.head()
                          name          created_at  key            city           spent             arrived                left
1   photos/20180913_183207.jpg 2018-09-13 18:32:07    0    Buenos-Aires 3 days 18:00:09 2018-09-12 23:29:52 2018-09-16 17:30:01
6   photos/20180909_141137.jpg 2018-09-09 14:11:36    0  Rio De Janeiro 4 days 11:55:38 2018-09-08 11:34:14 2018-09-12 23:29:52
14  photos/20180917_162240.jpg 2018-09-17 16:22:40    0        Santiago 2 days 23:24:12 2018-09-16 17:30:01 2018-09-19 16:54:13
22  photos/20180923_161707.jpg 2018-09-23 16:17:07    0     Mexico City 2 days 22:45:22 2018-09-22 20:43:42 2018-09-25 19:29:04
26  photos/20180917_111251.jpg 2018-09-17 11:12:51    0        Santiago 2 days 23:24:12 2018-09-16 17:30:01 2018-09-19 16:54:13
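The real read_photos lives in the gist; as a rough idea of what it might look like, here is a hypothetical sketch that pulls DateTimeOriginal from each photo's Exif data (it assumes Pillow is installed; the directory name and helper shape are illustrative only):

import os
from datetime import datetime
from PIL import Image

def read_photos(directory='photos'):  # hypothetical stand-in for the gist's version
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        exif = Image.open(path)._getexif() or {}  # raw Exif tags, keyed by tag id
        taken = exif.get(36867)  # 36867 is the Exif tag id for DateTimeOriginal
        if taken:
            yield path, datetime.strptime(taken, '%Y:%m:%d %H:%M:%S')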

After that, I got the number of photos taken in each city:

photos_by_city = photos \
    .groupby(by='city') \
    .agg({'name': 'count'}) \
    .rename(columns={'name': 'photos'}) \
    .reset_index()
>>> photos_by_city
             city  photos
0    Buenos-Aires     193
1          Cancun     292
2            Lima     295
3     Mexico City     256
4  Rio De Janeiro     422
5        Santiago     267

>>> photos_by_city.plot(x='city', y='photos', kind="bar",
...                     title='Photos by city', legend=False)

Cities

Let's go a bit deeper and use image recognition. To avoid reinventing the wheel, I used a slightly modified version of the TensorFlow ImageNet tutorial example to find what's in each photo:

classify_image.init()

tagged_photos = photos.copy()
tags = tagged_photos.name \
    .apply(lambda name: classify_image.run_inference_on_image(name, 1)[0]) \
    .apply(pd.Series)

tagged_photos[['tag', 'score']] = tags
tagged_photos['tag'] = tagged_photos.tag.apply(lambda tag: tag.split(', ')[0])
>>> tagged_photos.head()
                          name          created_at  key            city           spent             arrived                left       tag     score
1   photos/20180913_183207.jpg 2018-09-13 18:32:07    0    Buenos-Aires 3 days 18:00:09 2018-09-12 23:29:52 2018-09-16 17:30:01    cinema  0.164415
6   photos/20180909_141137.jpg 2018-09-09 14:11:36    0  Rio De Janeiro 4 days 11:55:38 2018-09-08 11:34:14 2018-09-12 23:29:52  pedestal  0.667128
14  photos/20180917_162240.jpg 2018-09-17 16:22:40    0        Santiago 2 days 23:24:12 2018-09-16 17:30:01 2018-09-19 16:54:13    cinema  0.225404
22  photos/20180923_161707.jpg 2018-09-23 16:17:07    0     Mexico City 2 days 22:45:22 2018-09-22 20:43:42 2018-09-25 19:29:04   obelisk  0.775244
26  photos/20180917_111251.jpg 2018-09-17 11:12:51    0        Santiago 2 days 23:24:12 2018-09-16 17:30:01 2018-09-19 16:54:13  seashore  0.24720

So now it's possible to find the things I took photos of the most:

photos_by_tag = tagged_photos \
    .groupby(by='tag') \
    .agg({'name': 'count'}) \
    .rename(columns={'name': 'photos'}) \
    .reset_index() \
    .sort_values('photos', ascending=False) \
    .head(10)
>>> photos_by_tag
            tag  photos
107    seashore     276
76    monastery     142
64     lakeside     116
86       palace     115
3           alp      86
81      obelisk      72
101  promontory      50
105     sandbar      49
17    bell cote      43
39        cliff      42

>>> photos_by_tag.plot(x='tag', y='photos', kind='bar',
...                    legend=False, title='Popular tags')

Popular tags

Then I was able to see what I was taking photos of in each city:

popular_tags = photos_by_tag.head(5).tag
popular_tagged = tagged_photos[tagged_photos.tag.isin(popular_tags)]
not_popular_tagged = tagged_photos[~tagged_photos.tag.isin(popular_tags)].assign(
    tag='other')

by_tag_city = popular_tagged \
    .append(not_popular_tagged) \
    .groupby(by=['city', 'tag']) \
    .count()['name'] \
    .unstack(fill_value=0)
>>> by_tag_city
tag             alp  lakeside  monastery  other  palace  seashore
city
Buenos-Aires      5         1         24    123      30        10
Cancun            0        19          6    153       4       110
Lima              0        25         42    136      38        54
Mexico City       7         9         26    197       5        12
Rio De Janeiro   73        45         17    212       4        71
Santiago          1        17         27    169      34        19

>>> by_tag_city.plot(kind='bar', stacked=True)
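One caveat for readers running this today: DataFrame.append was deprecated and later removed in pandas 2.0, so on a current pandas the same step can be written with pd.concat:

by_tag_city = pd.concat([popular_tagged, not_popular_tagged]) \
    .groupby(by=['city', 'tag']) \
    .count()['name'] \
    .unstack(fill_value=0)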

Tags by city

Although the most common thing on this plot is “other”, it’s still fun.

Gist with full sources.

Planet Python

Windows PowerShell and the Text-to-Speech REST API (Part 3)

Summary: Use Windows PowerShell to access the Cognitive Services Text-to-Speech API.

Q: Hey, Scripting Guy!

I was reading up on how we could use PowerShell to communicate with Azure to gain an access token. I’ve been just itching to see how we can use this! Would you show us some examples of this in use in Azure?

—TL

A: Hello TL, I would be delighted to! This is a cool way to play with PowerShell as well!

If you remember from the last post, we authenticated to Cognitive Services with the following lines, and it returned a temporary access token.

Try
{
    [string]$Token = $NULL

    # Rest API Method
    [string]$Method = 'POST'

    # Rest API Endpoint
    [string]$Uri = 'https://api.cognitive.microsoft.com/sts/v1.0/issueToken'

    # Authentication Key
    [string]$AuthenticationKey = '13775361233908722041033142028212'

    # Headers to pass to Rest API
    $Headers = @{ 'Ocp-Apim-Subscription-Key' = $AuthenticationKey }

    # Get Authentication Token to communicate with Text to Speech Rest API
    [string]$Token = Invoke-RestMethod -Method $Method -Uri $Uri -Headers $Headers
}
Catch [System.Net.WebException]
{
    Write-Output 'Failed to Authenticate'
}

The token was naturally stored in the object $Token, so it was easy to remember. I suppose we could have named it $CompletelyUnthinkableVariableThatIsPointless, but we didn't. Because we can use pretty descriptive names in PowerShell, we should. It makes documenting a script easier.

Our next task is to open up the documentation on the Cognitive Services API to see what information we need to supply. We can find everything we need to know here.

Under “HTTP headers,” we can see several pieces of information we need to supply.

HTTP headers table

X-Microsoft-OutputFormat specifies the format of the audio file that will be returned.

There are many industry-standard formats we can use. You'll have to play with the returned output to determine which one meets your needs.

I found that 'riff-16khz-16bit-mono-pcm' is the format needed for a standard WAV file. I chose WAV specifically because I can use the internal Windows services to play a WAV file without invoking a third-party application.
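For instance, here is a minimal sketch of that playback step (the WAV path is illustrative, not from the original post):

# Play the generated WAV with the built-in .NET SoundPlayer; no third-party application needed.
$Player = New-Object System.Media.SoundPlayer 'C:\Temp\speech.wav'
$Player.PlaySync()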

We’ll assign this to an appropriately named object.

$AudioOutputType = 'riff-16khz-16bit-mono-pcm'

Both X-Search-AppId and X-Search-ClientID are just unique GUIDs that identify your application. In this case, we’re referring to the PowerShell script or function we’re creating.

The beautiful part is that you can do this in PowerShell right now, by using New-Guid:

Screenshot of PowerShell

If you’d like to be efficient and avoid typing (well, unless you do data entry for a living and you need to type it…I once had that job!), we can grab the Guid property and store it on the clipboard.

New-Guid | Select-Object -ExpandProperty Guid | Set-Clipboard

But the GUID format needed by the REST API requires only the alphanumeric pieces, without the dashes. We can fix that with a quick Replace method.

(New-Guid | Select-Object -ExpandProperty Guid).Replace('-','') | Set-Clipboard

Run this once for each property, and paste the result into a descriptively named variable, like so:

$XSearchAppID = 'dccd93ecb3cf4535aac9350c9b5fb2f8'
$XSearchClientID = '45b403b6ae0d4f9ca13ca05f61a58ab2'

UserAgent is just a unique name for your application. Pick a unique but sensible name.

$UserAgent = 'PowerShellTextToSpeechApp'

Finally, Authorization is the token that was generated earlier, which is stored in $Token.

At this point, we put the headers together. Do you remember the headers from the authentication step? That set was small, but the format is the same.

$Headers = @{ 'Ocp-Apim-Subscription-Key' = $AuthenticationKey }

You can string it all together like this:

$Headers = @{ 'Property1'='Value'; 'Property2'='Value'; 'Property3'='Value'; 'Property4'='Value' }

But as you add more information, it becomes unreadable for others working on your script. This is a great case for using backticks ( ` ) to separate the content out. Every time I think about backticks, I think of Patrick Warburton as "The Tick."

Here is an example with the same information, spaced out with a space and then a backtick.

$Headers = @{ `
    'Property1' = 'Value'; `
    'Property2' = 'Value'; `
    'Property3' = 'Value'; `
    'Property4' = 'Value' `
}

Let's populate the values for our header from the examples I provided earlier on this page.

$AudioOutputType = 'riff-16khz-16bit-mono-pcm'
$XSearchAppID = 'dccd93ecb3cf4535aac9350c9b5fb2f8'
$XSearchClientID = '45b403b6ae0d4f9ca13ca05f61a58ab2'
$UserAgent = 'PowerShellTextToSpeechApp'

$Header = @{ `
    'Content-Type' = 'application/ssml+xml'; `
    'X-Microsoft-OutputFormat' = $AudioOutputType; `
    'X-Search-AppId' = $XSearchAppID; `
    'X-Search-ClientId' = $XSearchClientID; `
    'User-Agent' = $UserAgent; `
    'Authorization' = $Token `
}

With the header populated, we are now ready to proceed to our next major piece: actually taking text and converting it to audio content, by using Azure.

But we’ll touch upon that next time. Keep watching the blog and keep on scripting!

I invite you to follow the Scripting Guys on Twitter and Facebook. If you have any questions, send email to them at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum.

Sean Kearney, Premier Field Engineer, Microsoft

Frequent contributor to Hey, Scripting Guy!


Hey, Scripting Guy! Blog

PowerTip: Ensure that errors in PowerShell are caught

Summary: Here's how to make sure your errors get caught with Try/Catch/Finally.

Q: Hey, Scripting Guy! I've been trying to use Try/Catch/Finally, but some of my errors aren't getting caught. What could be the cause?

A: For Try/Catch/Finally to catch an error, you need to make sure the error defaults to a "Stop" action for the cmdlet in question. Here's a quick example:

try
{
    Get-ChildItem C:\Foo -ErrorAction Stop
}
catch [System.Management.Automation.ItemNotFoundException]
{
    'oops, I guess that folder was not there'
}
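If you'd rather not add -ErrorAction Stop to each cmdlet, one alternative is to make all errors terminating for the scope of your script:

# Make non-terminating errors terminating everywhere in this scope, so Try/Catch can see them.
$ErrorActionPreference = 'Stop'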


Hey, Scripting Guy! Blog

Windows PowerShell and the Text-to-Speech REST API (Part 4)

Summary: Send and receive content to the Text-to-Speech API with PowerShell.

Q: Hey, Scripting Guy!

I was playing with the Text-to-Speech API. I have it almost figured out, but I’m stumbling over the final steps of formatting the SSML markup language. Could you lend me a hand?

—MD

A: Hello MD,

Glad to lend a hand to a Scripter in need! I remember having that same challenge the first time I worked with it. It’s actually not hard, but I needed a sample to work with.

Let’s first off remember where we were last time. We’ve accomplished the first two pieces for Cognitive Services Text-to-Speech:

  1. The authentication piece, to obtain a temporary token for communicating with Cognitive Services.
  2. Headers containing the audio format and our application’s unique parameters.

Next, we need to build the body of content we need to send up to Azure. The body contains some key pieces:

  • Region of the speech (for example, English US, Spanish, or French).
  • Text we need converted to speech.
  • Voice of the speaker (male or female).

For more information about all this, see the section “Supported locales and voice fonts” in Bing text to speech API.

The challenge I ran into was just how to create the SSML content that was needed. SSML, which stands for Speech Synthesis Markup Language, is a standard for specifying just how speech should be spoken. Examples of this would be:

  • Content
  • Language
  • Speed

I could spend a lot of time reading up on it, but Azure gives you a great tool to create sample content without even trying! Check out Bing Speech, and look under the heading “Text to Speech.” In the text box, type in whatever you would like to hear.

In the sample below, I entered "Hello everyone, this is Azure Text to Speech."

Screenshot of Bing Speech

Now if you select View SSML (the blue button), you can see the code in SSML that would have been the body we would have sent to Azure.

Screenshot of SSML code

You can copy and paste this into your editor of choice. From here, I will try to break down the content from our example.

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="en-US"><voice xml:lang="en-US" name="Microsoft Server Speech Text to Speech Voice (en-US, JessaRUS)">Hello everyone, this is Azure Text to Speech</voice></speak>

The xml:lang attributes hold our locale, and the voice element's name attribute contains our service name mapping. The locale must always be matched with the service name mapping from the same row it came from. The double quotes are also equally important.

If you mix them up, Azure will wag its finger at you and give a nasty error back.

The text between the voice tags is the actual content that we would like Azure to convert to speech.

Let’s take a sample from the table, and change this to an Australian female voice.

Table with two rows

We first replace the locale with 'en-AU', and then the service name mapping with 'Microsoft Server Speech Text to Speech Voice (en-AU, Catherine)'.

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="en-AU"><voice xml:lang="en-AU" name="Microsoft Server Speech Text to Speech Voice (en-AU, Catherine)">Hello everyone, this is Azure Text to Speech</voice></speak>

Now if we'd like to have her say something different, we just change the text between the voice tags.

How does this translate in Windows PowerShell?

We can take the three separate components (locale, service name mapping, and content), and store them as objects.

$Locale = 'en-US'
$ServiceNameMapping = 'Microsoft Server Speech Text to Speech Voice (en-US, JessaRUS)'
$Content = 'Hello everyone, this is Azure Text to Speech'

Now you can have a line like this in Windows PowerShell to dynamically build out the SSML content, and change only the pieces you typically need.

$Body = '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="' + $Locale + '"><voice xml:lang="' + $Locale + '" name="' + $ServiceNameMapping + '">' + $Content + '</voice></speak>'
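If the quote juggling becomes hard to read, here is an equivalent sketch that uses PowerShell's -f format operator instead of string concatenation:

# Same SSML body, built from a template string with numbered placeholders.
$Ssml = '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="{0}"><voice xml:lang="{0}" name="{1}">{2}</voice></speak>'
$Body = $Ssml -f $Locale, $ServiceNameMapping, $Content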

At this point, we only need to call up the REST API to have it do the magic. But that is for another post!
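As a teaser, here is a hedged sketch of what that final call might look like. The endpoint URL is my assumption based on the Bing Speech documentation of that era, and the output path is illustrative; verify both against the current docs.

# Assumed endpoint and illustrative output path; not confirmed by this post.
$Uri = 'https://speech.platform.bing.com/synthesize'
Invoke-RestMethod -Method POST -Uri $Uri -Headers $Header -Body $Body -OutFile 'C:\Temp\speech.wav'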

See you next time when we finish playing with this cool technology!

I invite you to follow the Scripting Guys on Twitter and Facebook. If you have any questions, send email to them at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum.

Sean Kearney, Premier Field Engineer, Microsoft

Frequent contributor to Hey, Scripting Guy!

Hey, Scripting Guy! Blog